Abstract

Moment-closure techniques are commonly used to generate low-dimensional
deterministic models to approximate the average dynamics of stochastic systems on
networks. The quality of such closures is usually difficult to assess and,
furthermore, the relationship between model assumptions and closure accuracy is
often difficult, if not impossible, to quantify. Here we carefully examine some
commonly used moment closures, in particular a new one based on the concept of
maximum entropy, for approximating the spread of epidemics on networks by
reconstructing the probability distributions over triplets based on those over
pairs. We consider various models (SI, SIR, SEIR and Reed-Frost-type) under
Markovian and non-Markovian assumptions characterising the latent and infectious
periods. We initially study with care two special networks, namely the open
triplet and closed triangle, for which we can obtain analytical results. We then
explore numerically the exactness of moment closures for a wide range of larger
motifs, thus gaining understanding of the factors that introduce errors in the
approximations, in particular the presence of a random duration of the infectious
period and the presence of overlapping triangles in a network. We also derive a
simpler and more intuitive proof than previously available concerning the known
result that pair-based moment closure is exact for the Markovian SIR model on
tree-like networks under pure initial conditions. We also extend such a result to
all infectious models, Markovian and non-Markovian, in which susceptibles escape
infection independently from each infected neighbour and for which infectives
cannot regain susceptible status, provided the network is tree-like and initial
conditions are pure. This work represents a valuable step in enriching intuition
and deepening understanding of the assumptions behind moment closure
approximations and in putting them on a more rigorous mathematical footing.

Keywords: Pairwise model, SIR epidemic, Maximum entropy, Pair approximation,
Approximate dynamics

1. Introduction

Networks are becoming a ubiquitous tool for modelling systems of multiple
components with complex interactions between them [?]. This is particularly true
for epidemic models, where empirical advances in measurement of relevant
interactions are acting as a particular spur to theoretical developments [?].

One particular challenge for complex network modelling consists in the high
dimensionality of the dynamical systems. If a network has $N$ nodes, each of which
can be in one of $m$ states, then the dimensionality of a stochastic process for
the evolution of those states will be $m^N$ in the absence of a large discrete
symmetry group for the network or a specific combination of dynamical and network
models that allows for analytical results to be obtained (such as SIR dynamics on
an Erdős-Rényi graph [14]).

For general dynamics and topologies, network moment closure techniques provide a
commonly used method of gaining significant dimensional reduction, at the price
of losing an exact description of the system dynamics. These closures are based
on the idea of approximating the dynamics of small subgraphs in the network (e.g.
adjacent pairs of nodes) by forcing their time derivatives to depend only on the
state of subgraphs of the same or lower dimension instead of on the state of
larger subgraphs (e.g. triplets), thus deriving a closed system of equations.
Every closure therefore implicitly makes assumptions about the probability
distribution over states of certain parts of the system, in terms of probability
distributions over states of smaller parts of the system, although these are
often not stated explicitly. If we exclude the case of constructing joint
probabilities over pairs of adjacent nodes as the products of marginals over the
single nodes (which leads to the so-called mean-field approximation; see e.g.
[20]), the next most common moment-closure approximation involves describing the
probability of triplets of adjacent nodes being in any possible state based on
the knowledge of the probabilities over pairs. This has led to many so-called
pair approximation models, widely used both for theoretical [?] and practical [?]
purposes.

Unfortunately, the overall quality of a moment closure approximation is often
difficult to assess, thus severely limiting the generalisability of results based
on such approaches. Even when proved accurate in certain cases, it is not clear
whether such accuracy is preserved in other slightly different contexts.
Furthermore, moment-closure approximations, often proposed in an ad hoc fashion
on the basis of heuristic arguments, unavoidably impose assumptions on the
interactions between system components. The fact that such assumptions are often
obscure, and that their interaction with and impact on the other modelling
assumptions are far from trivial to unravel, compromises the neatness of the
approach and makes it somewhat less appealing from a theoretical point of view.

Consequently, there is significant interest within the research community in
deepening our understanding of which moment closure approximations are more
accurate than others, when they fail to reproduce the system dynamics exactly
and why [20, 10, 21, 23, 22]. As a first step in this direction, a recent trend
has involved applying moment closures to each specific subgraph of interest in
order to approximate its dynamics. This framework, on a large network, results in
a fairly large number of equations. Sharkey [20] refers to this modelling
approach as individual-based or pair-based, depending on whether the aim is to
describe only the dynamics of each single node or of pairs of nodes. We propose
to collectively refer to it as local network moment closure. On the other hand,
the original and most commonly used type of moment closure consists in counting
the number of subgraphs of interest in any possible configuration at any one time
and is where the most significant dimensionality reduction is gained [10, 11, 8].
Sharkey [?] refers to this other approach as mean-field approximation or pair
approximation, depending on whether the interest is in single nodes or pairs of
nodes. We suggest referring collectively to this framework as global (or
population-level) network moment closure.

Scaling up from local to global moment closure introduces a further round of
approximation, on top of the one already present at the local level. Sharkey [?]
points out how this second level of approximation depends on an averaging or
"mean-field" assumption of homogeneity, the accuracy of which depends primarily
on the heterogeneity in the network structure, more than on the dynamical errors
built in at the local level. Therefore, as a first step in gaining better
understanding of the quality of moment closure approximations in general, here we
focus only on local moment closures and the dynamical local errors they generate.

In the specific context of local moment closure approximations for
Susceptible-Infected-Recovered (SIR) epidemic models on a network, recent work by
Sharkey et al. [?] has shown that, provided the network has no short loops and
initial conditions are pure (i.e. the system starts in a specific state with
probability 1), the standard loopless pair-based local moment closure (see [20])
provides an exact description of the dynamics of single nodes and pairs of nodes,
from which, for example, the expected epidemic course can be obtained exactly.
When the network does have short loops, in particular triangles, other closure
techniques have been proposed. The most common of these is due to Kirkwood [?]
(see also [20, 21]), which can often be quite accurate in practice, but lacks
solid theoretical justification.

Recently, work has been done to provide more explicit derivations of novel
moment closures in the presence of closed loops. This has included arguments
about appropriate early asymptotic behaviour [?], non-independent Bernoulli
trials [?] and maximum entropy [18]. It turns out that these are equivalent at
'first order', but the maximum entropy (ME) approach is more readily
generalisable and can be used to derive a large variety of moment closures.

In this paper we carefully investigate the behaviour of various local moment 
closure techniques for reconstructing the behaviour of triplets in terms of pairs on 
networks of increasing size and complexity and try to clarify when and why they 
lack exactness and for which modelling assumptions. 

In Section 2 we introduce the notation and describe the basic model assumptions
for all models considered in the paper. In Section 3 we define the moment closure
approximations studied and we propose a different and possibly more intuitive
interpretation of the ME approximation. In Sections 4 and 5 we focus on the SIR
model on the simplest possible network topologies, namely an open triplet and a
closed triangle, and show how the behaviour of the moment closures changes when
changing the assumptions about the distribution of the infectious period. In
particular, in Section 5 we show that for such simple structures, all the moment
closure approximations that we consider here are exact when the infectious period
has a constant duration. When the duration is random, as is the case for the
Markovian model, the closure is in general only approximate, although the most
important quantities for the open triplet are still captured exactly. In Section
6 we explore the convergence of all approximations to the exact results as the
variance in the duration of the infectious period tends to 0, using a family of
non-Markovian epidemic models with Erlang-distributed durations of the infectious
period. Furthermore, we highlight the overall superior accuracy of the moment
closure technique based on ME, but shed light on its context-specific limitations
in comparison with the other closures. In Section 7 we explore how results extend
to slightly larger structures and build up the intuition about when the closures
considered here are exact on larger networks. Such intuition is then discussed in
Section 8, where we conjecture how the errors introduced by moment closure behave
on larger networks and also prove that the standard pair-based local
approximation is exact on tree-like networks with pure initial conditions for all
models considered here, thus extending and simplifying a result already known for
the SIR Markovian epidemic model.

2. General framework 

2.1. Labelled network 

We consider an undirected static network $\mathcal{G}$, which has a size-$N$ set
of nodes $\mathcal{N}$ and a set of links $\mathcal{L}$. Nodes are denoted by
$i, j, \dots \in \mathcal{N}$ and $\{i,j\} \in \mathcal{L}$ if and only if $i$
and $j$ are connected to each other (and we use the convention
$\{i,i\} \notin \mathcal{L}$).

At any time $t$, each node $i$ is labelled by a state $X_i(t) \in \Omega$, where
$\Omega$ is a set of states that depends on the epidemic model run on the network
(see below; for example $\Omega = \{S,I,R\}$). We will assume throughout that the
network structure is not affected by the states of its nodes.

Let $X(t) = (X_1(t), X_2(t), \dots, X_N(t))$ be a vector describing the random
state of the system at time $t$. We denote by $x = (x_1, x_2, \dots, x_N)$
($x_i \in \Omega$, $i = 1,2,\dots,N$) a specific system state, and let
$x^0 = (x^0_1, x^0_2, \dots, x^0_N)$ denote the initial state. Then the state of
the system at each time $t > 0$ is described by the probability distribution

\[ P^{x^0}(x;t) = P\left( X(t) = x \mid X(0) = x^0 \right) . \tag{1} \]

Note that, in general, a process is not fully specified by its marginal
distributions over time^1. However, for our purposes Equation (1) is sufficient.

^1 For example, the three-state Markov chains with generator matrices

\[
M_1 = \begin{pmatrix} -1 & 1/2 & 1/2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\quad \text{and} \quad
M_2 = \begin{pmatrix} -1 & 1/2 & 1/2 \\ 0 & -1 & 1 \\ 0 & 1 & -1 \end{pmatrix}
\]

have the same marginal distributions at any time, if they both start in the first
state. However, their behaviour is different.

If we are interested in the state of a subsystem, we first consider the set
$\mathcal{V} \subset \mathcal{N}$ of all indices of the nodes we are interested
in. Upon choosing a reference ordering, thus replacing the set $\mathcal{V}$ with
a vector $V$, we then consider the vectors $X_V$ and $x_V$ which contain only the
elements of $X$ and $x$ with indices in $V$. Applying the subscript $V$ can be
thought of as a projection on the subspace identified by $V$ of the
$N$-dimensional space $\Omega^N$. By definition,

\[ P_V^{x^0}(x_V;t) = P\left( X_V(t) = x_V \mid X(0) = x^0 \right) \tag{2} \]

is obtained by summing (1) over all indices not appearing in $V$. Note that the
initial conditions should remain specified on the full graph.
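The footnote example on the two three-state Markov chains can be checked
numerically. The following is a minimal sketch in plain Python, not part of the
original material: the helper names `mat_mul` and `expm` are introduced here for
illustration, and the matrix exponential is computed with a truncated Taylor
series, which is adequate only for small matrices such as these.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(m, t, terms=40):
    """Truncated Taylor series for exp(t * m); a sketch for tiny generators only."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        scaled = [[m[i][j] * t / k for j in range(n)] for i in range(n)]
        term = mat_mul(term, scaled)  # now equals (t * m)^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


# Generator matrices from the footnote.
M1 = [[-1.0, 0.5, 0.5], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
M2 = [[-1.0, 0.5, 0.5], [0.0, -1.0, 1.0], [0.0, 1.0, -1.0]]

P1 = expm(M1, 1.0)  # transition probabilities over one time unit
P2 = expm(M2, 1.0)

# Row 0 is the marginal distribution at t = 1 when starting in the first state.
print("marginals from state 1:", P1[0], P2[0])
# Row 1 shows that the two processes nevertheless behave differently.
print("from state 2:", P1[1], P2[1])
```

Starting from the first state, both chains give the same marginal distribution,
whereas from the second state the first chain never moves and the second one
keeps switching between states 2 and 3.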

2.2. Epidemic models 

We are interested in the spread of an epidemic on the static network described
above. We consider different epidemic models, namely an SI, an SIR, an SEIR
and a Reed-Frost model. In all models, the epidemic spreads by infective ($I$)
nodes transmitting the infection to susceptible ($S$) neighbours.

2.2.1. SI model 

In the SI model, $\Omega = \{S,I\}$. Upon infection, node $i$ makes infectious
contacts with each one of its neighbours at the points of a homogeneous Poisson
process with rate $\tau > 0$. A contacted node, if susceptible, becomes
infectious. Therefore, the epidemic results in the infection of all nodes in the
connected components containing at least one initial infective node, and every
infective ultimately infects all of its neighbours.

2.2.2. SIR model 

In the SIR model, $\Omega = \{S,I,R\}$. Upon infection, node $i$ is assumed to
experience an infectious period of random (non-negative) duration $T_i$, and
during its infectious period, it makes infectious contacts with each one of its
neighbours at the points of a homogeneous Poisson process with rate $\tau > 0$. A
contacted node, if susceptible, becomes infectious, and at the end of the
infectious period, the node recovers ($R$) and becomes permanently immune to the
infection. The infectious periods and Poisson processes associated with different
infectious nodes are assumed to be mutually independent; similarly, the Poisson
processes from the same infectious node towards different neighbours are mutually
independent, conditionally on its infectious period. We assume that for all $i$,
the random variables $T_i$ are independent and identically distributed (iid)
according to a random variable $T$ with mean $m_T = E[T]$. Without loss of
generality, we assume in all numerical examples that $m_T = 1$ and, unless stated
otherwise, that $\tau = 1$.

2.2.3. SEIR model

In the SEIR model, $\Omega = \{S,E,I,R\}$. This model is similar to the SIR one,
with the additional presence of a latent period ($E$) following the infection of
each node $i$, of duration $L_i$. During the latent period, a node cannot
transmit the infection and will eventually progress to the infectious stage. The
latent periods of different nodes are iid according to a random variable $L$ with
mean $m_L = E[L]$, and are assumed to be independent of all infectious periods
and Poisson processes describing infectious contacts, irrespective of whether
they refer to the same node or different nodes.

2.2.4. Reed-Frost-type models

In the standard Reed-Frost (RF) model, $\Omega = \{S,E,R\}$. Upon infection, node
$i$ experiences a latent period, at the end of which it spreads all its
infectivity at a single point in time and then recovers permanently. We consider
extensions of the standard Reed-Frost model to both a random duration of the
latent period and random probabilities of transmission. More specifically, we
assume that node $i$'s latent period has duration $L_i$ (iid for different nodes
according to $L$, with mean $m_L$) and we denote by $P_i$ the random probability
with which node $i$ can infect each one of its neighbours. All the $P_i$s are iid
according to a random variable $P$, with mean $p = E[P]$, and are independent of
latent periods, whether referring to the same or to different nodes. Note that
the infections of different neighbours by node $i$ are not independent events,
but are independent conditionally on the value of $P_i$. In the literature, the
term Reed-Frost model refers only to the case where the latent period is of fixed
duration and $P$ is non-random (i.e. $L \equiv m_L$ and $P \equiv p$), and the
term randomised Reed-Frost model refers to a constant latent period
$L \equiv m_L$ and a random probability of transmission $P$. Here, therefore, we
refer to all possible combinations of random $L$ and $P$ as Reed-Frost-type
models. Note that Reed-Frost-type models can be viewed as limiting cases of SEIR
models where $m_T \to 0$ while $\tau \to \infty$, such that the mean probability
of transmission $p = \tau m_T$ is kept constant, for suitably chosen
distributions for the sojourn times in states $E$ and $I$.

2.3. Moment closures

A moment closure, $\alpha$ say, is a rule for the generation of a probability
distribution for a set $\mathcal{V}$ of nodes in $\mathcal{N}$ from the
probability distributions over subsets of $\mathcal{V}$. To avoid trivial cases,
we implicitly assume that the subgraph identified by $\mathcal{V}$ (i.e.
consisting of the nodes in $\mathcal{V}$ and all and only the edges between nodes
in $\mathcal{V}$) is connected. Again, we find it easier to specify an order for
the nodes in $\mathcal{V}$, thus effectively listing them in a vector $V$.
Whether or not a specific moment closure is exact for some $t > 0$, i.e. whether

\[ P_{V,\alpha}^{x^0}(x_V;t) = P_V^{x^0}(x_V;t) , \tag{3} \]

in general depends on the particular choice of $x_V$ and $x^0$. So, in what
follows, with the notation $x^0 \to x_V$, we generically refer to the
investigation of the evolution of the system from the initial state $x^0$ to the
state $x_V$.

If clear from context, the initial condition $x^0$ will often be removed. We will
also drop the explicit dependence on $t$, implicitly assuming that equalities are
meant to hold for all $t > 0$ and inequalities are meant to indicate that the
corresponding equality fails for at least one value of $t > 0$.

The main focus of this paper is on pair-based approximations to epidemic
dynamics on graphs of size 3, i.e. on local moment closures where the probability
of the vector $V$ of three nodes being in any possible configuration is
reconstructed from the probability of single nodes and pairs of nodes in $V$
being in any possible configuration. Various common possible choices are
carefully described in Section 3.

Given that in this context $V$ will have $\le 3$ indices and we are often
interested in indicating them explicitly, we will further simplify the notation
by writing, for example for $V = (1,2,3)$,

\[
P_{123}(ABC) \text{ instead of } P_{(1,2,3)}(ABC)
\quad \text{and} \quad
P_{123,\alpha}(ABC) \text{ instead of } P_{(1,2,3),\alpha}(ABC)
\qquad (A,B,C \in \Omega).
\]

If obvious from the context which vector $V$ of three nodes is under
consideration, we will often simply denote the probability over $V$ as $P(ABC)$.

2.4. Explicit topologies 

In order to develop understanding of the impact of the assumptions behind
moment closure approximations, we consider numerous simple topologies. These
small networks are presented in Figure [?]. However, we first begin with a
careful study of the behaviour of the considered pair approximations in the
context of the SIR model on the simplest possible graphs, namely the open triplet
and closed triangle. These are the focus of Sections 4, 5 and 6, on which the
other sections are then built.

In both the open triplet and the closed triangle we have
$\mathcal{N} = \{1,2,3\}$, but different sets of links:

\[
\mathcal{L}_{\text{open}} = \{\{1,2\},\{2,3\}\}, \qquad
\mathcal{L}_{\text{closed}} = \{\{1,2\},\{2,3\},\{3,1\}\}. \tag{4}
\]




Figure [?] shows the states and transitions of these explicit models: even for
these small networks, there is a lot of dynamical structure for any moment
closure to capture.
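To give a feel for the size of this state space, the sketch below enumerates all
SIR configurations of three nodes and the single-node transitions (infection
along a link, or recovery) allowed on each topology. It is an illustration under
our own conventions, not the authors' code: nodes are indexed 0, 1, 2 instead of
1, 2, 3, all 27 configurations are listed regardless of whether they are
reachable from a given initial condition, and the names `OPEN`, `CLOSED` and
`transitions` are hypothetical.

```python
from itertools import product

STATES = "SIR"
# Link sets from (4), with nodes relabelled 0, 1, 2.
OPEN = {frozenset({0, 1}), frozenset({1, 2})}
CLOSED = OPEN | {frozenset({2, 0})}


def transitions(links):
    """List (state, next_state) pairs for one-node SIR moves on three nodes."""
    moves = []
    for state in product(STATES, repeat=3):
        for i, s in enumerate(state):
            if s == "I":
                # Recovery of an infective node.
                moves.append((state, state[:i] + ("R",) + state[i + 1:]))
            elif s == "S" and any(
                state[j] == "I" and frozenset({i, j}) in links
                for j in range(3) if j != i
            ):
                # Infection of a susceptible with at least one infective neighbour.
                moves.append((state, state[:i] + ("I",) + state[i + 1:]))
    return moves


print(len(list(product(STATES, repeat=3))), "configurations")
print(len(transitions(OPEN)), "transitions on the open triplet")
print(len(transitions(CLOSED)), "transitions on the closed triangle")
```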


3. Moment closure approximations 

In this section we illustrate all the moment closures we consider in this
analysis. Note that any vector $V$ of three distinct nodes of a connected graph
either forms an open triplet or a closed triangle. Also, without loss of
generality, we assume throughout this section that $V = (1,2,3)$.


3.1. Unclustered closure 

In the literature, the most common moment closure approximation of triplets
in terms of pairs is obtained following the naive idea of multiplying the
probability of every pair of nodes linked by an edge, and then dividing by the
probability of nodes common to pairs of edges [?]. On an open triplet, assuming
that $i = 2$ is the central node, this approximation, hereafter denoted by $o$,
is defined as

\[
P_o(ABC) = P_{123,o}(ABC) = \frac{P_{12}(AB)\,P_{23}(BC)}{P_2(B)} . \tag{5}
\]
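As a brief worked check (our own addition, assuming only that the pair and
single-node marginals are mutually consistent, i.e. $\sum_c P_{23}(Bc) = P_2(B)$
and $\sum_b P_{12}(Ab) = P_1(A)$), the closure (5) reproduces the pair marginals
it is built from and is properly normalised:

```latex
\sum_{c} P_o(ABc)
  = \frac{P_{12}(AB)}{P_2(B)} \sum_{c} P_{23}(Bc)
  = \frac{P_{12}(AB)}{P_2(B)}\, P_2(B)
  = P_{12}(AB),
\qquad
\sum_{a,b,c} P_o(abc) = \sum_{a,b} P_{12}(ab) = 1 .
```

The same argument with the roles of nodes 1 and 3 exchanged gives
$\sum_a P_o(aBC) = P_{23}(BC)$.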


3.2. Kirkwood closure 

On a closed triangle, the same approach leads to the following approximation,
popularised in epidemic modelling by [11] and sometimes attributed to Kirkwood
[?] (see also [20, 21]), which we denote by $K$:

\[
P_K(ABC) = \frac{P_{12}(AB)\,P_{23}(BC)\,P_{13}(AC)}{P_1(A)\,P_2(B)\,P_3(C)} . \tag{6}
\]

Kirkwood's approximation has the natural property of being symmetric in $A$, $B$
and $C$, but it is not always a proper distribution over system states (i.e.
sometimes $\sum_{a,b,c} P_K(abc) \ne 1$) and it does not always agree with the
marginals it is constructed from (i.e. $\sum_c P_K(ABc)$ is in general different
from $P_{12}(AB)$; see [?, 21]).
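Both shortcomings can be seen on a deliberately extreme toy case. The sketch
below is our own illustration, not an example from the original text: it assumes
a hypothetical two-state joint distribution on a closed triangle in which all
three nodes always share the same state, builds its single-node and pair
marginals, and applies (6). The helper names `marginal` and `kirkwood` are
introduced here, and nodes are indexed 0, 1, 2.

```python
from fractions import Fraction
from itertools import product

STATES = ("S", "I")

# Hypothetical joint distribution: the three nodes are perfectly correlated.
joint = {x: Fraction(0) for x in product(STATES, repeat=3)}
joint[("S", "S", "S")] = Fraction(1, 2)
joint[("I", "I", "I")] = Fraction(1, 2)


def marginal(dist, nodes):
    """Sum a distribution over triples onto the given tuple of node positions."""
    out = {}
    for state, prob in dist.items():
        key = tuple(state[i] for i in nodes)
        out[key] = out.get(key, Fraction(0)) + prob
    return out


def kirkwood(dist):
    """Kirkwood closure (6) built from the marginals of `dist`."""
    p1, p2, p3 = (marginal(dist, (i,)) for i in range(3))
    p12 = marginal(dist, (0, 1))
    p23 = marginal(dist, (1, 2))
    p13 = marginal(dist, (0, 2))
    closure = {}
    for a, b, c in dist:
        denom = p1[(a,)] * p2[(b,)] * p3[(c,)]
        numer = p12[(a, b)] * p23[(b, c)] * p13[(a, c)]
        closure[(a, b, c)] = numer / denom if denom else Fraction(0)
    return closure


pk = kirkwood(joint)
print("sum of P_K:", sum(pk.values()))
print("P_12(SS):", marginal(joint, (0, 1))[("S", "S")],
      "but sum_c P_K(SSc):", marginal(pk, (0, 1))[("S", "S")])
```

In this toy case the Kirkwood values sum to 2 rather than 1, and the (1,2)
marginal of $P_K$ is twice the marginal it was built from.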


3.3. Maximum entropy 

In order to overcome these limitations, Rogers [18] recently suggested
constructing an approximation based on the principle of Maximum Entropy (ME),
which we here denote by $\mu$. In our context, this means that the quantity

\[ -\sum_{a,b,c} P_\mu(abc) \ln\left( P_\mu(abc) \right) , \tag{7} \]

which is the information entropy of the distribution $P_\mu$, is maximised
subject to the constraints imposed by the marginals $\{P_i(A), P_{ij}(AB)\}$,
i.e. that $P_1(A) = \sum_{b,c} P_\mu(Abc)$ and $P_{12}(AB) = \sum_c P_\mu(ABc)$
(and similarly for all other nodes or pairs). For the open triplet, the closure
(5) is the ME distribution. For the closed triangle there is no closed-form
solution, although following Rogers [18, Eq. 4], we know that a set of functions
$\{q_{ij}\}$, $\{i,j\} \in \mathcal{L}_{\text{closed}}$, exists such that the ME
distribution can be written in product form

\[ P_\mu(ABC) = q_{12}(AB)\, q_{23}(BC)\, q_{31}(CA) . \tag{8} \]

These functions are not, however, straightforwardly related to the marginal
probabilities and so an alternative approach is preferable for explicit
calculations.


3.4. Iterative scaling 

Rogers [18] provides an iterative scheme to calculate the ME distribution: start
with the uniform distribution $P^{(0)}(x) = 1/|\Omega|^3$ over all possible
system states $x \in \Omega^3$ and cycle through all the three pairs (in any
order; we choose the order $V_1 = (1,2)$, $V_2 = (2,3)$, $V_3 = (1,3)$) to
obtain, for $n = 0,1,2,\dots$:

\[
\begin{aligned}
P^{(n),V_1}(ABC) &= P_{12}(AB)\, \frac{P^{(n)}(ABC)}{\sum_{c} P^{(n)}(ABc)} , \\
P^{(n),V_2}(ABC) &= P_{23}(BC)\, \frac{P^{(n),V_1}(ABC)}{\sum_{a} P^{(n),V_1}(aBC)} , \\
P^{(n+1)}(ABC) = P^{(n),V_3}(ABC) &= P_{13}(AC)\, \frac{P^{(n),V_2}(ABC)}{\sum_{b} P^{(n),V_2}(AbC)} .
\end{aligned}
\tag{9}
\]

Rogers [18] cites results from Csiszar and Shields [2] to argue that the sequence

\[
P^{(0)}(ABC), \dots, P^{(n),V_1}(ABC), P^{(n),V_2}(ABC), P^{(n+1)}(ABC),
P^{(n+1),V_1}(ABC), \dots
\]

converges as $n \to \infty$ and that the limiting distribution is the ME
distribution $P_\mu(ABC)$, which is known to be unique (provided the marginals
are consistent).
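The scheme in (9) is short enough to sketch directly. The following plain-Python
version is an illustration only and is not the Matlab implementation supplied by
the authors: the function names `pair_marginal` and `iterative_scaling`, the
fixed number of sweeps and the hypothetical input distribution are all
assumptions made here, and nodes are indexed 0, 1, 2.

```python
from itertools import product


def pair_marginal(p, i, j):
    """Sum a distribution over triples onto the pair of positions (i, j)."""
    out = {}
    for x, prob in p.items():
        key = (x[i], x[j])
        out[key] = out.get(key, 0.0) + prob
    return out


def iterative_scaling(p12, p23, p13, states, sweeps=500):
    """Cycle through the pairs (1,2), (2,3), (1,3) as in (9), from a uniform start."""
    triples = list(product(states, repeat=3))
    p = {x: 1.0 / len(triples) for x in triples}
    targets = ((p12, 0, 1), (p23, 1, 2), (p13, 0, 2))
    for _ in range(sweeps):
        for target, i, j in targets:
            current = pair_marginal(p, i, j)
            # Rescale so that the (i, j) marginal matches its target exactly.
            p = {
                x: target.get((x[i], x[j]), 0.0) * prob / current[(x[i], x[j])]
                if current[(x[i], x[j])] > 0.0 else 0.0
                for x, prob in p.items()
            }
    return p


# Hypothetical joint distribution, used only to generate consistent pair marginals.
states = ("S", "I")
weights = [0.20, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.20]
joint = dict(zip(product(states, repeat=3), weights))

p12, p23, p13 = (pair_marginal(joint, i, j) for i, j in ((0, 1), (1, 2), (0, 2)))
p_me = iterative_scaling(p12, p23, p13, states)
for x in sorted(p_me):
    print("".join(x), round(p_me[x], 6))
```

A fixed sweep count keeps the sketch simple; a practical implementation would
more likely stop once the largest change in any marginal falls below a
tolerance.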

If a closed-form approximation is needed, Rogers [18] suggests using what is
obtained from the algorithm after the first (triple) step. Denoting this 1-step
ME approximation with $p$, we have:

\[
P_p(ABC) = P^{(1)}(ABC)
= \frac{P_{12}(AB)\,P_{23}(BC)\,P_{13}(AC)}
       {P_2(B) \sum_{b} \dfrac{P_{12}(Ab)\,P_{23}(bC)}{P_2(b)}} . \tag{10}
\]

This closure has also been derived independently from different arguments in [7]
and [23]. On the open triplet, the 1-step ME approximation (10) leads once more
to (5). On the closed triangle, it overcomes the key limitations of Kirkwood's,
i.e. it is a proper distribution over system states and has the correct
marginals. However, it depends on the arbitrary choices of the starting
distribution and the order in which to cycle through the pairs. By contrast,
$P_\mu(ABC)$ does not depend on either of these choices [2]. Note that other
distributions are possible, for which some or all the requirements above are
satisfied (for example, the algebraic mean of the six possible forms of 1-step
ME, one per permutation order through the pairs, would be independent of the
cycling order). However, among all distributions, ME is the only one that
introduces no additional (and hence unjustifiable) information apart from the
desired constraints, and is therefore the most theoretically appealing one.

We finally suggest a different and, to our knowledge, novel formulation that
may provide a different point of view of the assumptions underlying maximum
entropy. A simple and systematic means of generating triangles (that can be
readily extended to other networks) is to derive iteratively a set of functions
$\{\tilde q_{ij}\}$ through the following procedure. Denoting by
$\tilde q^{(n)}_{ij}(AB)$ the approximation of $\tilde q_{ij}(AB)$ obtained at
the $n$-th iteration, we start with $\tilde q^{(0)}_{ij}(AB) = P_{ij}(AB)$. Then
for the iterative step, we define

\[
P^{(n)}_\mu(ABC) =
\frac{\tilde q^{(n)}_{12}(AB)\, \tilde q^{(n)}_{23}(BC)\, \tilde q^{(n)}_{13}(AC)}
     {\sum_{a,b,c} \tilde q^{(n)}_{12}(ab)\, \tilde q^{(n)}_{23}(bc)\, \tilde q^{(n)}_{13}(ac)} .
\]

If $\sum_c P^{(n)}_\mu(ABc) > P_{12}(AB)$, then $\tilde q^{(n)}_{12}(AB)$ is
updated to a new lower value
$\tilde q^{(n+1)}_{12}(AB) < \tilde q^{(n)}_{12}(AB)$, while, if
$\sum_c P^{(n)}_\mu(ABc) < P_{12}(AB)$, then one has to set
$\tilde q^{(n+1)}_{12}(AB) > \tilde q^{(n)}_{12}(AB)$, where the change in value
is determined by questions of numerical efficiency. Assuming convergence, we
define

\[
P_\mu(ABC) =
\frac{\tilde q_{12}(AB)\, \tilde q_{23}(BC)\, \tilde q_{31}(CA)}
     {\sum_{a,b,c} \tilde q_{12}(ab)\, \tilde q_{23}(bc)\, \tilde q_{31}(ca)} . \tag{11}
\]

These $\{\tilde q_{ij}\}$ differ from the $\{q_{ij}\}$ in (8) by virtue of being
probability distributions, although clearly (11) is identical to (8) if the
denominator is absorbed into the individual probabilities, and the existence and
uniqueness of $\tilde q$ implicitly assumed follow from the results of [18].

While such an argument offers a different route to the same result, we found
that the iterative scaling approach outlined in (9) above is computationally more
efficient (in addition to having been proved to converge). An implementation of
the iterative scaling approach in Matlab is provided as Electronic Supplementary
Material.

4. Markovian SIR model on three nodes 

Here, as well as in Sections 5 and 6, we specifically focus on the performance
of pair approximation on the open triplet and the closed triangle, where nodes
are labelled $i = 1,2,3$ and $i = 2$ is the middle node in the triplet.

The majority of epidemic models appearing in the literature that make use
of moment closure assume that $T \sim \mathrm{Exp}(\gamma)$, for some constant
$\gamma > 0$, and are therefore fully Markovian. The main reason is mathematical
convenience: a set of ordinary differential equations is then derived to describe
the probability of triplets being in each configuration of interest (local moment
closure) or to describe the average behaviour of the original stochastic model in
the limit of an infinite population (global moment closure).

Therefore, in Figure [?] we explore how all approximations above ($K$, $p$ and
$\mu$) compare to the exact probability distributions $P^{x^0}(x;t)$ on the open
triplet and closed triangle at time $t = 1$, for some natural choices of $x^0$
and $x$, when the Markovian model is used with $\tau = 1$ and $m_T = 1$. This
figure shows that, on the closed triangle, no approximation is exact for any
state. The case of the open triplet is more subtle: for example, $P^{x^0}(x;t)$
is different from $P_o^{x^0}(x;t)$ as defined in (5) when $x^0 = (SIS)$ and
$x = (SRS)$. The reason is that the random duration of the infectious period
imposes correlations between the two susceptibles even if there is no direct
link between them. In fact, denote by $Q$ the probability that a susceptible
escapes infection when $t \to \infty$. Then
$\lim_{t\to\infty} P^{x^0}(x;t) = E[Q^2]$, which is in general different from
$\lim_{t\to\infty} P_o^{x^0}(x;t) = E[Q]^2$, except when $Q$ is non-random (e.g.
constant duration of infection). Intuitively, if individual 2 has recovered
without infecting individual 1, then it is more likely that the infectious period
was shorter than expected, which in turn increases the probability that
individual 3 also escaped infection. Therefore, the joint probability that both
have escaped infection ($P^{(SIS)}(SRS)$) is higher than that obtained through
(5), where the two are assumed to escape infection independently of each other.
For the same reason, it is possible to verify that the ME approximation also
underestimates $P^{(SIS)}(RRR)$ and overestimates $P^{(SIS)}(RRS)$ and
$P^{(SIS)}(SRR)$. This insight suggests that the qualitative features that can be
drawn from Figure [?] are not exclusive to the Markovian model, but extend to all
models with a random duration of infectious periods.
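The size of this discrepancy can be worked out in closed form for the Markovian
model. The sketch below is our own calculation rather than a result quoted from
the text: it assumes $T \sim \mathrm{Exp}(\gamma)$ and initial condition $(SIS)$,
so that a neighbour escapes infection with probability $e^{-\tau T}$ given $T$,
and the function names `exact_srs` and `closure_srs` are hypothetical.

```python
import math


def exact_srs(t, tau=1.0, gamma=1.0):
    """P^{(SIS)}(SRS; t) = E[1{T <= t} exp(-2 tau T)] for T ~ Exp(gamma)."""
    rate = gamma + 2.0 * tau
    return gamma / rate * (1.0 - math.exp(-rate * t))


def closure_srs(t, tau=1.0, gamma=1.0):
    """Closure (5): P12(SR) * P23(RS) / P2(R), each factor in closed form."""
    rate = gamma + tau
    pair = gamma / rate * (1.0 - math.exp(-rate * t))  # P12(SR) = P23(RS)
    single = 1.0 - math.exp(-gamma * t)                # P2(R) = P(T <= t)
    return pair * pair / single


for t in (1.0, 50.0):
    print(t, exact_srs(t), closure_srs(t))
```

With $\tau = \gamma = 1$ the exact value at $t = 1$ is roughly 0.317 against
roughly 0.296 for the closure, and for large $t$ the two tend to
$E[Q^2] = 1/3$ and $E[Q]^2 = 1/4$ respectively, in line with the underestimate
described above.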

Analogously to the fact that $P_o^{(SIS)}(SRS) \ne P^{(SIS)}(SRS)$, in the
presence of a random infectious period we also have that (see Figure [?])

\[ P_o^{(ISS)}(III) \ne P^{(ISS)}(III) , \]

and therefore approximation $o$ on the open triplet fails to be exact for all
$t > 0$ in these cases and in all those to which the system can evolve from
$(IIS)$ and $(III)$ (e.g. $(RIS)$ or $(RRI)$). However, even in the Markovian
case (see Figure [?]), we have a set of equations that hold true:

\[
\begin{aligned}
P_o^{(ISS)}(ISS) &= P^{(ISS)}(ISS) , \\
P_o^{(ISS)}(ISI) &= P^{(ISS)}(ISI) , \\
P_o^{(ISS)}(ISR) &= P^{(ISS)}(ISR) , \\
P_o^{(ISS)}(RSS) &= P^{(ISS)}(RSS) , \\
P_o^{(ISS)}(RSI) &= P^{(ISS)}(RSI) , \\
P_o^{(ISS)}(RSR) &= P^{(ISS)}(RSR) , \\
P_o^{(ISI)}(ISI) &= P^{(ISI)}(ISI) , \\
P_o^{(ISI)}(ISR) &= P^{(ISI)}(ISR) , \\
P_o^{(ISI)}(RSI) &= P^{(ISI)}(RSI) , \\
P_o^{(ISI)}(RSR) &= P^{(ISI)}(RSR) , \\
P_o^{(ISR)}(ISR) &= P^{(ISR)}(ISR) , \\
P_o^{(ISR)}(RSR) &= P^{(ISR)}(RSR) ,
\end{aligned}
\tag{13}
\]

as well as

\[
\begin{aligned}
P_o^{(SIS)}(SIS) &= P^{(SIS)}(SIS) , \\
P_o^{(SIS)}(IIS) &= P^{(SIS)}(IIS) , \\
P_o^{(SIS)}(SII) &= P^{(SIS)}(SII) , \\
P_o^{(SIS)}(III) &= P^{(SIS)}(III) .
\end{aligned}
\tag{14}
\]


Intuitively, the results in (14) hold because the intermediate case has not
recovered yet: for any time $t$ at which the closure is studied, we know that the
intermediate infective has been infectious for a non-random duration $t$ and
therefore the events of infecting either of the neighbours are independent of
each other. On the other hand, the results listed in (13) hold because the
intermediate susceptible rules out the presence of any correlation between the
two extremes: either 1 cannot infect 3 (and therefore, e.g.
$P^{(ISS)}(ISI) = 0$) or, if both are infectious, their behaviour is
uncorrelated because 2 escapes infection independently from both. Again, both
these qualitative behaviours transcend the Markovian model itself. More formally,
focusing our attention only on states $ISS$ and $ISI$, we highlight the following
result.

Proposition 1. On an open triplet,

\[ P_o(ISS) = P(ISS) \quad \text{and} \quad P_o(ISI) = P(ISI) \tag{15} \]

for all times $t > 0$, when the initial conditions are $x^0 = (ISS)$ or
$x^0 = (ISI)$.

Proof. Consider first the case $x^0 = (ISS)$. Clearly $P(ISI) = 0$, so
$P_o(ISI) = P(ISI)$ holds trivially. For state $x = (ISS)$ instead,

\[ P_o(ISS) = \frac{P_{12}(IS)\,P_{23}(SS)}{P_2(S)} . \]

Because the triplet is open, $P_2(S) = P_{23}(SS)$ and $P_{12}(IS) = P(ISS)$, and
hence $P_o(ISS) = P(ISS)$.

Consider now $x^0 = (ISI)$. Clearly, $P(ISS) = 0$, as infected nodes cannot
return to the susceptible state, so $P_o(ISS) = P(ISS)$ holds trivially. For
state $x = (ISI)$ instead,

\[ P_o(ISI) = \frac{P_{12}(IS)\,P_{23}(SI)}{P_2(S)} , \]

but given the initial conditions all factors equal $P(ISI)$ and the closure is
still exact. □
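For the SIR model, the second case of the proof can also be spelled out through
a short factorisation argument. The sketch below is our own addition: it assumes
the SIR model of Section 2.2.2 with initial condition $(ISI)$, and the functions
$f_I$ and $f_R$ are notation introduced here for illustration only.

```latex
% f_I(t): an end node is still infectious at time t and has not contacted node 2
% f_R(t): an end node has recovered by time t without ever contacting node 2
f_I(t) = P(T > t)\, e^{-\tau t}, \qquad
f_R(t) = E\!\left[ \mathbf{1}_{\{T \le t\}}\, e^{-\tau T} \right].

% Nodes 1 and 3 act on node 2 independently, so for A, C \in \{I, R\}:
P(ASC) = f_A(t)\, f_C(t), \qquad
P_{12}(AS) = f_A(t)\,\bigl[f_I(t) + f_R(t)\bigr], \qquad
P_{23}(SC) = \bigl[f_I(t) + f_R(t)\bigr]\, f_C(t), \qquad
P_2(S) = \bigl[f_I(t) + f_R(t)\bigr]^2 .

% Substituting into (5):
P_o(ASC) = \frac{P_{12}(AS)\, P_{23}(SC)}{P_2(S)} = f_A(t)\, f_C(t) = P(ASC).
```

Taking $A = C = I$ gives $P_o(ISI) = P(ISI)$, and the other choices of $A$ and
$C$ cover the remaining states reachable from $(ISI)$ with a susceptible middle
node.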


This result works for any assumptions about the infectious period, but is
particularly important in the Markovian case because it provides the basis for
why the pair-based approximation for a Markovian SIR epidemic spreading on a more
general unclustered network, in which only the $ISS$ and $ISI$ states appear, is
exact (as long as the starting configuration is pure; see [19]). A formal proof
of this first appeared in Sharkey et al. [19]. However, in Section 8.1 we will
provide a simpler and more general proof.


5. Constant infectious period: SIR model on three nodes 

Although the standard approximation $o$ on the open triplet is always exact, at
least for the dynamically important states $ISS$ and $ISI$, we have argued that
this is not the case for many other states because of the random duration of the
infectious period. We now show that the random duration of the infectious period
is the main reason why all moment closure approximations fail to be exact, both
on the open triplet and on the closed triangle. In what follows we assume a
constant infectious period of duration $T \equiv m_T$ and we arbitrarily assume
that an individual is still infectious at $t = m_T$ and is immune immediately
after, i.e. for $t > m_T$.

Proposition 2. For the SIR model on an open triplet, when the infectious period
has constant duration,

$$P_o(ABC) = P(ABC), \qquad (16)$$

for all $A, B, C \in \{S, I, R\}$, all times $t > 0$, and all initial conditions.

Proof. We consider each initial condition separately.

(i) Assume $x^0 = (ISS)$ and recall Q. Because of the initial condition, the
probability of all cases in which the state $A$ of individual 1 is $S$ is 0, i.e.

$$P(SBC) = P_{12}(SB) = P_{13}(SC) = P_1(S) = 0$$

for all $B, C \in \{S, I, R\}$. Now consider separately the cases in which $A = I$ and
$A = R$.

When $A = I$, $P(ABC) = 0$ for all $t > m_I$, but also $P_{12}(AB)$ and so $P_o(ABC)$
are null and (16) holds trivially. Therefore, consider only the times $t \le m_I$. Then,
$P_{12}(RB) = P(RBC) = 0$, so that

$$P_{23}(BC) = P(SBC) + P(IBC) + P(RBC) = P(IBC) = P(ABC)$$

and

$$P_2(B) = P_{12}(SB) + P_{12}(IB) + P_{12}(RB) = P_{12}(IB) = P_{12}(AB),$$

and thus $P_o(ABC) = P(ABC)$. Therefore (16) holds for all $t > 0$.

When $A = R$, the argument is similar. In particular, $P(ABC) = 0$ for all $t \le m_I$,
but also $P_{12}(AB)$ and therefore $P_o(ABC)$ are null. For $t > m_I$, instead,
$P_{12}(IB) = P(IBC) = 0$, so that

$$P_{23}(BC) = P(SBC) + P(IBC) + P(RBC) = P(RBC) = P(ABC)$$

and

$$P_2(B) = P_{12}(SB) + P_{12}(IB) + P_{12}(RB) = P_{12}(RB) = P_{12}(AB),$$

and thus $P_o(ABC) = P(ABC)$. Therefore, again, (16) holds for all $t > 0$.

(ii) For all other initial conditions $x^0$ in which individual 1 is non-susceptible at the
start, a similar argument to the one above can be used to prove that the proposition
still holds.

(iii) When the initial condition is $x^0 = (SSI)$ or any other in which individual 3 is
non-susceptible from the start, the proposition also follows by symmetry.

(iv) For $x^0 = (SIS)$, the problem is slightly different. First of all we take the
standard convention that $P_o(ABC) = 0$ for $B = S$ (i.e. we assume that a ratio is null
when the numerator is null, irrespective of the value of the denominator). Then,
for $B = I$ or $B = R$, the result holds trivially when $t > m_I$ or $t \le m_I$, respectively.
When the result is not trivial, it holds because for a constant duration of the
infectious period, individual 2 transmits (or has transmitted) independently to 1 and 3,
so that the joint distribution of the state of pairs (1,2) and (2,3) is the product of
the marginals.

(v) All other initial states in which individual 2 is not susceptible at the start follow
trivially. □
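Case (iv) can be made concrete with closed-form expressions. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it starts from $x^0 = (SIS)$, assumes transmission across each link at rate `tau` while node 2 is infectious, and compares the exact probability of state SRS with its pair-based closure $P_{12}(SR)P_{23}(RS)/P_2(R)$, first for an exponentially distributed infectious period with rate `gamma` and then for a constant one of length `m_i`. Function and parameter names are hypothetical.

```python
import math


def srs_markov(tau, gamma, t):
    """Exact P(SRS) and its pair closure when T ~ Exp(gamma), x0 = (SIS)."""
    # Node 2 recovers at time s <= t without having infected either neighbour.
    exact = gamma / (gamma + 2 * tau) * (1 - math.exp(-(gamma + 2 * tau) * t))
    # Pair marginal: node 2 recovered, node 1 escaped (node 3 unconstrained).
    p12_sr = gamma / (gamma + tau) * (1 - math.exp(-(gamma + tau) * t))
    p2_r = 1 - math.exp(-gamma * t)
    return exact, p12_sr ** 2 / p2_r      # P23(RS) = P12(SR) by symmetry


def srs_constant(tau, m_i, t):
    """Exact P(SRS) and its pair closure when T = m_i is constant, x0 = (SIS)."""
    if t <= m_i:                          # node 2 is still infectious
        return 0.0, 0.0                   # convention 0/0 = 0 for the closure
    escape = math.exp(-tau * m_i)         # one neighbour escapes infection
    return escape ** 2, escape * escape / 1.0   # P2(R) = 1 for t > m_i


print("Markovian:", srs_markov(tau=1.0, gamma=1.0, t=1.0))
print("constant :", srs_constant(tau=1.0, m_i=1.0, t=2.0))
```

With a constant infectious period the two neighbours escape independently and the closure reproduces the exact value; with an exponential one, the shared random duration correlates the two escapes and the two numbers differ.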


Proposition 3. For the SIR model on a closed triangle, when the infectious period
has constant duration, all moment closure approximations considered here are
exact, i.e.

$$P_\kappa(ABC) = P_{\mu_1}(ABC) = P_\mu(ABC) = P(ABC), \qquad (17)$$

for all $A, B, C \in \{S, I, R\}$, all times $t > 0$, and all initial conditions.

Proof. We analyse each moment-closure approximation separately.

(a) Kirkwood. For the case of Kirkwood's approximation $\kappa$, we need to prove
that:

$$\frac{P_{12}(AB)\,P_{23}(BC)\,P_{13}(AC)}{P_1(A)\,P_2(B)\,P_3(C)} = P(ABC). \qquad (18)$$

(i) Consider first the initial condition $x^0 = (ISS)$. Because of the initial condition,
the probability of all cases in which the state $A$ of individual 1 is $S$ is 0, i.e.

$$P(SBC) = P_{12}(SB) = P_{13}(SC) = P_1(S) = 0$$

for all $B, C \in \{S, I, R\}$. Now consider separately the cases in which $A = I$ and
$A = R$.

When $A = I$, $P(ABC) = 0$ for all $t > m_I$, but also $P_{12}(AB)$ and so $P_\kappa(ABC)$
are null and (18) holds trivially (we adopted the convention that the indeterminate
form 0/0 equals 0). Therefore, consider only the times $t \le m_I$. Then,
$P_{12}(RB) = P_{13}(RC) = P(RBC) = 0$, so that

$$P_2(B) = P_{12}(SB) + P_{12}(IB) + P_{12}(RB) = P_{12}(IB) = P_{12}(AB)$$

and, similarly, $P_3(C) = P_{13}(AC)$. Therefore,

$$P_\kappa(ABC) = \frac{P_{23}(BC)}{P_1(A)} = P(ABC),$$

because

$$P_{23}(BC) = P(SBC) + P(IBC) + P(RBC) = P(IBC) = P(ABC)$$

and $P_1(A) = 1$. Hence, (18) holds for all $t > 0$.

(ii) A similar argument holds for $A = R$, but now the trivial case when all
probabilities are 0 is when $t \le m_I$ and the other considerations apply to $t > m_I$.

(iii) The calculations for any other initial condition in which individual 1 is
non-susceptible are simply a special case of those given above.

(iv) The result works for any other initial condition for symmetry reasons (we can
simply define individual 1 to be an initial non-susceptible).

(b) First-step ME. For the case of approximation $\mu_1$, obtained by stopping the ME
algorithm after a single triple step, we need to prove that:

$$\frac{P_{12}(AB)\,P_{23}(BC)\,P_{13}(AC)}{P_2(B)\,\sum_b \dfrac{P_{12}(Ab)\,P_{23}(bC)}{P_2(b)}} = P(ABC). \qquad (19)$$

(i) Analogously to before, first assume we start from $x^0 = (ISS)$ and consider the
case $A = I$ and $t \le m_I$ (the result is trivial for $t > m_I$). Then:

$$P_{\mu_1}(ABC) = \frac{P_2(B)\,P_{23}(BC)\,P_3(C)}{P_2(B)\,\sum_b P_{23}(bC)} = P_{23}(BC) = P(ABC).$$

The case $A = R$ and $t > m_I$ is analogous.

(ii) Given the asymmetry in (19), we still need to consider separately the case of
$x^0 = (SIS)$ (all the others work as special cases by symmetry). In this case, for
$t \le m_I$, the sum at the denominator of (19) contains only the term for $b = I$, and
cancels out with the first 2 terms at the numerator ($P_2(b) = 1$). The result follows
immediately because $P_2(B) = 1$ and $P_{13}(AC) = P(ABC)$.

(c) Full ME. Finally we prove the result for the ME approximation $\mu$. We already
know from (b) above that if we start the ME algorithm from the uniform
distribution $P^{(0)}(ABC) = 1/27$, for all $A, B, C \in \{S, I, R\}$, after the first triple
step we reach $P^{(1)}(ABC) = P_{\mu_1}(ABC) = P(ABC)$. We now show that, if we apply
another triple step to $P(ABC)$, we remain on the same distribution $P(ABC)$. In
other words, starting from any initial distribution, convergence of the ME algorithm,
restricted to its output after every triple step, occurs after a single triple step.

To show this, we expand the first triple step of the algorithm (as in (19), but
by keeping explicitly $P^{(0)}(ABC)$ in the equations). Considering separately every
initial condition with one infective and 2 susceptibles, we use the same arguments
used in other proofs (for example, when $x^0 = (ISS)$, in the suitable time range,
we know that $P_{23}(BC) = P(ABC)$ and that sums over the state $a$ of individual 1
contain only the probabilities for $a = A$) to prove that $P^{(1)}(ABC) = P(ABC)$. □

Remark. It is worth mentioning that the proofs of Propositions 2 and 3 above
require only the initial infective (or any of the initial infectives, if more than one)
to have a constant duration of the infectious period. Therefore the result readily
extends to the case in which the three individuals have possibly different durations
of infection, as long as they are non-random. Furthermore, in the case of the
SI model, the same arguments can be used to prove that all approximations are
always exact.

6. Erlang-distributed infectious period: SIR model on three nodes 

Analytical progress becomes difficult when the infectious period is not
constant. Conversely, numerical methods based on continuous-time Markov chains
are straightforward to implement to study the behaviour of the Markovian model.
In order to bridge the gap between these two extremes, we extend the framework
used for the Markovian model by allowing infectives to go through a series of
infectious stages, each with independently and identically distributed exponential
infectious periods, to model an overall Erlang-distributed sojourn time in the
infectious state, with mean $m_I = 1$. As the number $n_I$ of infectious stages increases,
the variance decreases as $1/n_I$. The results are reported in Figure]^: for each of
the three starting points $x^0$ (at time $t = 0$), the overall "distance" between the exact
distribution and the moment-closure approximation is measured by taking, at each
time $t > 0$, the sum of squared differences (SSD),
$\sum_x [P^{x^0}(x;t) - P^{x^0}_a(x;t)]^2$ (with $a$ denoting the approximation considered), and
then integrating it over all times. The choice of SSD to measure the discrepancy
between distributions instead of the possibly more natural Kullback-Leibler (KL)
divergence is appropriate because Kirkwood's approximation does not lead to a
proper distribution over system states, thus often yielding negative values for the
KL divergence that are hard to interpret.
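The two ingredients of this section, the Erlang staging and the time-integrated SSD, can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the distributions are assumed to be given as dictionaries from state labels to probabilities on a common time grid, and the time integral is approximated by the trapezoidal rule (the source does not say which quadrature was used).

```python
def erlang_stage_rate_and_variance(n_stages, mean=1.0):
    """Rate of each exponential stage and variance of the total sojourn time."""
    stage_rate = n_stages / mean              # each stage has mean mean/n_stages
    variance = n_stages / stage_rate ** 2     # sum of n independent stage variances
    return stage_rate, variance


def ssd(exact, approx):
    """Sum of squared differences between two distributions over states."""
    return sum((exact[x] - approx.get(x, 0.0)) ** 2 for x in exact)


def integrated_ssd(times, exact_series, approx_series):
    """Trapezoidal approximation of the time integral of the SSD."""
    values = [ssd(e, a) for e, a in zip(exact_series, approx_series)]
    return sum(0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
               for i in range(len(times) - 1))


for n in (1, 2, 4, 8):
    print(n, erlang_stage_rate_and_variance(n))
```

Unlike the KL divergence, `ssd` stays well defined when the approximation does not sum to one, which is the situation described above for Kirkwood's closure.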
6.1. Open triplet

Figure [^ highlights how the exact result for the theoretical limit of a constant
duration of infection is approached and how such convergence depends on the
starting state $x^0$ and the infection rate $\tau$. Note in particular how the slowest
convergence seems to be attained for intermediate (though starting-state-dependent)
values of $\tau$ and also how the approximation performs particularly poorly for the
starting condition $x^0 = (SIS)$.

Figure [^ combines the contribution of all states in an aggregate measure of the
approximation performance. Decomposing such an aggregate measure, however,
reveals significant heterogeneity (see Figures S1 and S2 of the Supplementary
Material), with negative and positive errors in different cases. The decomposition
also confirms the presence of cases when the approximation is exact and the
particularly poor performance when the system starts with the intermediate node
infected (bottom row of Figures S1 and S2 of the Supplementary Material).

6.2. Closed triangle 

On the closed triangle, Figure [^ describes the overall "distance", measured
again by the time-integrated SSD, between each approximation and the exact
probability distribution over all system states, for different starting states, as a
function of the variance in the duration of the infectious period. On the other
hand, Figure [^ expresses the same distance as a function of the infectivity $\tau$. It is
quite evident that, overall, ME performs better than the other approximations, in
particular Kirkwood's. However, the rate of convergence seems to be comparable
to that of other closures.

Again, assessing the overall quality of each approximation on the closed
triangle with such an aggregate measure hides the strong heterogeneity that can be
seen when decomposing the SSD into the contributions of different states. In
general (see Figures S3-S6 of the Supplementary Material), ME often performs better
than Kirkwood's approximation, although we collect in Figure S7 some of the
extreme examples of its variability in performance. In particular, in the somewhat
trivial case of $(ISS) \to (ISS)$, we have found ME to be consistently less accurate
than Kirkwood's approximation over the entire parameter spectrum, sometimes
by almost an order of magnitude. The implications of such heterogeneity for the
performance of each approximation in any practical context are still unclear and
require further investigation. In fact, despite ME appearing superior when the
contribution of all states is equally weighted, when studying the system dynamics
at the population level, the infection process, the current epidemic phase, as well
as the specific network structure, all interact in a complex fashion to produce
different proportions of triangles in each state. Uneven weights associated with each
of the specific transitions $x^0 \to x$ could overturn the current conclusions in some
cases. In particular, it is not unreasonable to imagine a large proportion of
triangles that have started from but not yet left the (ISS) state, a case in which ME
performs particularly poorly.

A final comparison between the various approximation methods on the closed
triangle consists in stratifying the contribution of each state $x$ to the overall SSD
measure (see Figure S8 in the Supplementary Material). In addition to the
quantitatively smaller error of the ME approximation compared to the others, its
more balanced decomposition, both across states and in particular over time,
represents a further element of merit.

7. Results for other motifs 

In the previous sections we have focused on the SIR model mainly because
it is the one that is most commonly considered in the literature. We have also
restricted our attention to a single open triplet or closed triangle, because they are
simple enough for analytical results to be obtained. We now show that some of
these results are peculiar to the particularly simple topologies considered, and in
particular depend on the fact that the presence of initially infected cases inside the
triplet or triangle significantly reduces the degrees of freedom of the system. However,
such initial careful analysis provides important insight, which guides us in how to
approach extensions to other motifs and to larger networks, in particular in terms
of the role that a random duration of the infectious period plays in the exactness
of various moment closures.

Analytical proofs become cumbersome as the complexity of the graph
increases, and appear particularly difficult to obtain in the case of ME. Therefore,
we opt for extending the numerical exploration of Section 6 to build some
intuition about the errors in moment closure approximations on open triplets and
triangles that are subgraphs of other slightly larger networks. We note from the
start that, if the aim is to understand how our results extend to large networks, at
some point the numerical method we are using needs to be abandoned in favour
of a dynamical system where one can deduce the rate of change of the state of
a pair based on the states of its neighbours [19].

Although recent results [10, 25] involve the formulation of dynamical systems
based on time-since-infection approaches, leading to equations with distributed
delays, by far the easiest starting point for writing a dynamical system for each
pair of nodes is to use ordinary differential equations. This requires the use of
constant rates (i.e. Markovian models). However, in Proposition|^ we have shown
that, as soon as we introduce randomness in the duration of the infectious period,
moment closure approximations fail on networks with loops.

The SI model, apart from being simpler because of the presence of only 2
states, has the further benefit of being both Markovian and with a "constant"
(infinite) duration of infection. Therefore, we focus on it as our baseline model for
exploring the exactness of moment closure approximations for networks with more
than three nodes, with the understanding that closures that fail in the SI model
cannot be exact if latency or recovery are added.

Figure [^ reports a range of motifs of increasing complexity. The three digits
appended to the end of the motif name are the indices of the nodes (as in the figure)
of the triplet on which the closure is applied. Unless stated otherwise, the initial
condition is represented by node 1 having just been infected.

7.1. SI model 

The table reports a comprehensive list of motifs, based on Figure j^, on which the
moment closures are tested for the SI model. Because of the high dimensionality
of the exploration, we only report the two dynamically important states (ISS) and
(ISI) and, to further enrich our understanding, states (IIS) and (III). We choose
to observe the error for those four states at only one time point per motif, for which
we have checked that the probability of the system being in the most interesting
states is non-zero. Unless specified otherwise, such time is taken to be $t = 0.5$
(to be compared with the time scale $m_I = 1$, which we use in the presence of
recovery).

Approximations are divided into three groups, where the first and second assume
one initially infected node and, respectively, the closure on an open triplet or a
closed triangle, while the third assumes multiple initially infected nodes. In each
group, the list of cases examined is further grouped in subsets with the intention
of testing a particular network feature over small graphs of increasing complexity.
The table indicates whether each closure approximation is exact or not at the time
tested (hereafter we will say it "works"), although wherever possible the specific
time is chosen such that a positive answer is indicative of general validity for all
times (a negative answer, or "failure", is of course sufficient to discard exactness).
Only ME is tested on closed triangles, because if it fails, Kirkwood's
approximation and 1-step ME also fail. Comments are added to provide extra information
where appropriate. Areas of grey background are the most useful ones to gain
understanding of whether the results of the propositions above scale up to larger
networks and up to which point.

7.1.1. Open triplet 

We know from the results above that the closure works for the SI model on
the open triplet with a single initial infective. We verify that extensions to any
tree-like structure also work, in line with the results of Sharkey et al. [19] and
Theorem 1 below. Triplets where the infection enters from the central node (3star324 and
Tree324) do not contribute to the dynamics of the spread on a tree.

Although the closure appears to work on loops of size 4 when the system
starts inside the loop (Square123 and ToastB123), if the system starts outside the
loop (KiteEmpty234 and FishEmpty345, or KiteDiagB234 and FishDiagB345)
the closure fails, thus suggesting none of the closures examined here extends to
large networks containing loops larger than 3. The behaviour is slightly
different if the system starts already in the loop but not in the triplet (Square234 and
ToastA234). However, starting again outside the loop and entering the loop not
in the triplet (KiteEmpty345, FishEmpty456, KiteDiagA345 and FishDiagA456)
the closure fails, confirming the impossibility of scaling our exact results to large
networks with loops of more than 3 nodes.
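The role of the entry point into a loop can be reproduced with a small exact computation for the Markovian SI model. The sketch below is an assumption-laden illustration, not the authors' code: the graph (a square 1-2-3-4 with a tail node 0 attached to node 1) and its labelling are hypothetical and do not correspond to the motif names above, the rate `tau = 1` and time `t = 0.5` are chosen for illustration, and the master equation is integrated with a fixed-step RK4 scheme.

```python
def si_distribution(n, edges, seeds, t_end, tau=1.0, dt=0.005):
    """Exact state probabilities of the Markovian SI model on a small graph.

    States are bitmasks: bit v set means node v is infected."""
    nbrs = [[] for _ in range(n)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    moves = []                                   # (from_state, to_state, rate)
    for s in range(1 << n):
        for v in range(n):
            if not (s >> v) & 1:
                k = sum((s >> u) & 1 for u in nbrs[v])
                if k:
                    moves.append((s, s | (1 << v), tau * k))

    def deriv(p):
        dp = [0.0] * len(p)
        for s, s2, rate in moves:
            flow = rate * p[s]
            dp[s] -= flow
            dp[s2] += flow
        return dp

    p = [0.0] * (1 << n)
    p[sum(1 << v for v in seeds)] = 1.0          # pure initial condition
    for _ in range(round(t_end / dt)):
        k1 = deriv(p)
        k2 = deriv([x + 0.5 * dt * k for x, k in zip(p, k1)])
        k3 = deriv([x + 0.5 * dt * k for x, k in zip(p, k2)])
        k4 = deriv([x + dt * k for x, k in zip(p, k3)])
        p = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


def marginal(p, fixed):
    """Probability that every node in `fixed` (node -> 'S' or 'I') matches."""
    return sum(px for s, px in enumerate(p)
               if all(((s >> v) & 1) == (st == "I") for v, st in fixed.items()))


def pair_closure_error(p, triplet, state):
    """|P_ij(AB) P_jk(BC) / P_j(B) - P_ijk(ABC)| for an open triplet (i, j, k)."""
    (i, j, k), (a, b, c) = triplet, state
    exact = marginal(p, {i: a, j: b, k: c})
    pj = marginal(p, {j: b})
    approx = marginal(p, {i: a, j: b}) * marginal(p, {j: b, k: c}) / pj if pj else 0.0
    return abs(approx - exact)


loop_with_tail = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 1)]
for seed in (1, 0):                               # inside vs outside the loop
    p = si_distribution(5, loop_with_tail, [seed], t_end=0.5)
    print("seed", seed, "ISS error:", pair_closure_error(p, (1, 2, 3), "ISS"))
```

When the seed is node 1 the first member of the triplet is infected with probability one and the closure error vanishes; when the seed is the tail node, the infection time of node 1 is random and correlated with node 3 through node 4, and a clearly non-zero error appears.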

7.1.2. Closed triangle 

Proposition 3 guarantees that all closures are exact on a closed triangle. This
appears to be true even for slightly more complex networks (ToastA123 and 4Full123)
as long as the system starts in the triangle on which the closure is applied. ME
seems to be exact also on closed triangles when the system does not start within
the triangle (MartiniGlass234 and BowTie345) and the same behaviour applies
when the triangle is part of a larger motif (KiteDiagA234 and FishDiagA345, or
KiteFull234 and FishFull345). Note however that, whenever the system starts
outside the triangle, both Kirkwood and 1-step ME fail. This suggests that neither of
these closures scales up to larger networks with triangles and only ME gives hope
for such an extension. However, as soon as the system is allowed to enter the
triangle on which the closure is applied through more than one route (ToastB234
and all the following networks) even ME fails.
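The contrast between starting inside and outside the triangle can be illustrated for Kirkwood's closure with the Markovian SI model. The following sketch makes several assumptions: the graph is a triangle 1-2-3 with a single stem node 0 attached to node 1 (a hypothetical labelling in the spirit of the "martini glass" motif, not necessarily that of the figure), `tau = 1`, `t = 0.5`, and the master equation is solved with fixed-step RK4. Only Kirkwood's closure is implemented here; the ME algorithm is not.

```python
def si_distribution(n, edges, seeds, t_end, tau=1.0, dt=0.005):
    """Exact SI state probabilities; states are bitmasks of infected nodes."""
    nbrs = [[] for _ in range(n)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    moves = [(s, s | (1 << v), tau * sum((s >> u) & 1 for u in nbrs[v]))
             for s in range(1 << n) for v in range(n) if not (s >> v) & 1]
    moves = [m for m in moves if m[2] > 0]       # keep only possible infections

    def deriv(p):
        dp = [0.0] * len(p)
        for s, s2, rate in moves:
            dp[s] -= rate * p[s]
            dp[s2] += rate * p[s]
        return dp

    p = [0.0] * (1 << n)
    p[sum(1 << v for v in seeds)] = 1.0
    for _ in range(round(t_end / dt)):
        k1 = deriv(p)
        k2 = deriv([x + 0.5 * dt * k for x, k in zip(p, k1)])
        k3 = deriv([x + 0.5 * dt * k for x, k in zip(p, k2)])
        k4 = deriv([x + dt * k for x, k in zip(p, k3)])
        p = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


def marginal(p, fixed):
    """Probability that every node in `fixed` (node -> 'S' or 'I') matches."""
    return sum(px for s, px in enumerate(p)
               if all(((s >> v) & 1) == (st == "I") for v, st in fixed.items()))


def kirkwood_error(p, triangle, state):
    """Absolute error of Kirkwood's closure for one state of a closed triangle."""
    (i, j, k), (a, b, c) = triangle, state
    exact = marginal(p, {i: a, j: b, k: c})
    num = (marginal(p, {i: a, j: b}) * marginal(p, {j: b, k: c})
           * marginal(p, {i: a, k: c}))
    den = marginal(p, {i: a}) * marginal(p, {j: b}) * marginal(p, {k: c})
    approx = num / den if den else 0.0           # convention 0/0 = 0
    return abs(approx - exact)


stem_and_triangle = [(0, 1), (1, 2), (2, 3), (1, 3)]
for seed in (1, 0):                              # inside vs outside the triangle
    p = si_distribution(4, stem_and_triangle, [seed], t_end=0.5)
    print("seed", seed, "Kirkwood ISS error:", kirkwood_error(p, (1, 2, 3), "ISS"))
```

With the seed inside the triangle the error is at rounding level, whereas with the seed on the stem it is of order $10^{-2}$ for these parameter values, in line with the failure of Kirkwood's closure reported above.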

We further confirm our intuition that ME is exact even in the presence of
overlapping triangles, on condition that each triangle can only be entered through
a single route, by observing that ME works also on two special larger networks,
namely DoubleKite237 and DoubleFish348.

Note that, although ME works on the first triangle encountered in a full clique
of size 4 (4Full123, KiteFull234 and FishFull345), it does not work on other
triangles inside the same clique (4Full234, KiteFull345 and FishFull456), therefore
leaving no hope for any of the closures investigated here to work on networks
containing fully connected cliques of size larger than 3. In particular, this suggests
that, in the so-called households models, where fully connected cliques
(households) are joined by a few between-clique links, the dynamics of infection spread
on single nodes or pairs (and hence, for example, the expected epidemic course)
cannot be represented exactly only in terms of pairs using any of the moment
closure techniques considered here, unless households have size no larger than 3.

7.1.3. Multiple initial infectives 

In line with Sharkey et al. [19] and Theorem 1 below, we verify that the
closure on the open triplet works on tree-like networks also with multiple initial
infectives, on condition that the initial conditions are pure.

However, on closed triangles, we verify that even ME fails when multiple
initial infectives are present, even if the initial conditions are pure (Tripod234). This
suggests that even when the maximum household size is 3, the moment closures
considered here are exact only when a single initial case starts the epidemic.

7.2. SIR model 

We have shown above that, if we introduce the possibility of recovering,
exactness of any closure on a network that contains some triangles can only be
hoped for with a constant duration of infectious period. Therefore, we report in a
second table an analysis similar to the one done for the SI model, but for
the SIR model with constant duration of the infectious period $T = m_I$ (identified
by the letter "C"). The results are not exact but are inferred by visually examining
the convergence, as in Figure]^, as the number $n_I$ of infectious classes increases.
The potential exactness of the approximation also for the Markovian model with
exponentially distributed $T$ (denoted by "M"), and therefore for all models with
$T$ random but non-degenerate, is reported in the comments.

For ease of comparison with the careful examination presented for the SI
model, the SIR table has the same structure as the SI one. However, we know
that, if on a particular graph the closure fails for the SI model, it is bound to fail
also for the SIR model. Therefore, we only fill in the table partially, leaving aside
all tests that do not help in gaining further understanding.

On the open triplet, the table confirms the exactness of the closure on tree-like
structures for any infectious period. It also confirms that the closure is not exact
on loops of size 4, if the initially infected node is outside the loop, thus suggesting
no large network with loops larger than triangles admits exact dynamics under the
pair-based approximations considered.

However, the results for the MartiniGlass234 and the KiteDiagA234 suggest
that there is no hope for the studied approximations to be exact even if loops
only consist of closed triangles, whenever the initially infected node is outside
the triangle and even if the infectious period is of constant duration. Therefore,
we conclude that, for SIR models, the pair approximations considered can only
be exact in general on a tree-like network (see Sharkey et al. [19] and Theorem 1
below).

As an example, we show two results of our numerical investigation, one
suggesting exactness on the triangle (Figure]^) and one suggesting failure to be exact
on the MartiniGlass234 graph (Figure [^).

7.3. SEIR 

Given the negative result obtained for the SIR model, we do not expect the
SEIR model to perform any better. For this reason we only report our results in Table
S1 in the Supplementary Material. Results are mostly for the SEIR model with
constant duration of both the latent and the infectious period, hereafter denoted
by "CC", though comments on cases where the latent, the infectious, or both are
Markovian (MC, CM and MM, respectively) are given when useful. Again, results
are not exact, but rather extrapolations from trends for increasing numbers of
latent classes, $n_E$, and of infectious classes, $n_I$. Given that most models in the
literature only consider the SIR model, it is interesting to verify, in line with Theorem 1
below, that the fact that pair approximations work on a tree-like structure extends
to the additional presence of a latent period.

The exploration is computationally intensive, given the number of classes
involved. Therefore, some cases were dubious and we did not feel we could
conclude anything with confidence. However, they do not affect the whole picture.

7.4. Reed-Frost-type models 

The models of Reed-Frost type (RF) represent a special case that needs to be
treated with care. In particular, nobody is ever in the I state. We have visually
explored moment closure on many triplet and triangle states, but we here present
only states ESS and ESE, under the assumption that they are the dynamically
important ones, as well as EES and EEE, to keep the parallel with the previous
cases. Exploration of other states did not contribute to gaining further insight. The
initially infected node (or nodes) are assumed to have just entered state E.

As noted in Section 2.2.4, RF-type models can be studied numerically by
considering an SEIR model and assuming an infectious period that is much shorter
than the latent period. In all numerical analyses, we have used a mean of 1 for the
latent period and of 0.001 for the infectious one. The infection rate is
adjusted to $\tau = 1000$ to keep fixed the mean probability of transmitting across a
link, $\tau m_I = 1$. Furthermore, we investigated both the "standard" RF model with
a constant duration $L$ of the latent period and a fixed probability $P = p$,
denoted by "CC" and approximated by letting both $L$ and $T$ be Erlang-distributed
with decreasing variance (by increasing the number of stages $n_E$ and $n_I$), as well
as the cases, denoted by "MC", "CM" and "MM", where either $L$ or $T$ or both are
exponentially distributed, respectively.
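To see how the product $\tau m_I = 1$ translates into a per-link transmission probability under the different assumptions on $T$, one can use the standard expression $E[1 - e^{-\tau T}]$. The sketch below assumes that transmission across a link occurs at the first point of a Poisson process of rate `tau` running for the whole infectious period; the function name and its `n_stages` argument are hypothetical and introduced only for illustration.

```python
import math


def transmission_probability(tau, mean, n_stages=None):
    """Per-link transmission probability E[1 - exp(-tau * T)].

    n_stages=None : constant infectious period T = mean.
    n_stages=n    : T is Erlang with n stages and the given mean
                    (n = 1 is the exponential, i.e. Markovian, case).
    """
    if n_stages is None:
        return 1.0 - math.exp(-tau * mean)
    # Laplace transform of an Erlang(n, rate n/mean) variable evaluated at tau.
    return 1.0 - (1.0 + tau * mean / n_stages) ** (-n_stages)


tau, m_i = 1000.0, 0.001            # values quoted in the text, tau * m_i = 1
print("constant   :", transmission_probability(tau, m_i))
for n in (1, 2, 10, 100):
    print("Erlang n =", n, ":", transmission_probability(tau, m_i, n))
```

As the number of stages grows, the Erlang value approaches the constant-duration one, which is the sense in which the "CC" case is approximated by increasing the number of stages.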

Results are similar to those of the SI model: in particular, we found that ME
seems to be exact also on closed triangles even when the system does not start
within the triangle (MartiniGlass234 and BowTie345). The same behaviour
applies when the triangle is part of a larger motif (KiteDiagA234 and FishDiagA345,
or KiteFull234 and FishFull345), as long as the triangle can be accessed only
through one route. Unlike the SI model, here 1-step ME also appears to be exact
in certain cases, though Kirkwood's approximation still fails. Surprisingly, we
found that most results that hold for the CC case also hold for a random latent
period and a random transmission probability $P$.

As for the SEIR model, the exploration in this case is computationally
intensive, given the number of classes involved and, in addition, the numerical
challenges of having both small and very large rates simultaneously. As before,
dubious cases are highlighted, but do not affect the whole picture. Unlike the SEIR
model, however, the exploration can be somewhat simplified by noting that, in the
RF-CC model, many states never occur with positive probability. We carefully
selected the times at which to investigate each closure, and also monitored the
probability with which the system can be in those states at those times, to make sure
results were not trivial. Figure [^ reports an example of the numerical exploration
in the dubious case of the KiteDiagB345 graph. Despite the open possibility that
convergence for states ESS and ESE might occur if more classes could be added
(though we believe it does not), Figure [T^ for the KiteFull345 graph clearly shows
the error increasing, strongly suggesting the approximation is unlikely to be exact
in general anyway.

8. Extension to large networks


8.1. Tree-like networks 

In Proposition 1 we showed that moment closure on the dynamically important
states ISS and ISI for the SIR model on an open triplet is exact. Our numerical
exploration in Section 7 suggests it holds for larger networks and different models
and Figure 11 confirms it via simulation for the SI model with a single initial
infective. In line with the above, Sharkey et al. [19] proved that the same results
hold for the Markovian SIR model on any tree-like network and any number of
initial infectives, as long as the initial condition is pure.

Here we provide a much simpler and more intuitive proof of this result, which
holds much more generally (in particular for all models considered here).


Theorem 1. For any connected triplet $(i, j, k)$ on any tree-like network, and any
model of infection spread in which susceptibles escape infection independently
from each of their infected neighbours and, after infection, can never return to the
susceptible state,

$$P^{x^0}_{ijk,o}(ISS) = P^{x^0}_{ijk}(ISS) \quad\text{and}\quad P^{x^0}_{ijk,o}(ISI) = P^{x^0}_{ijk}(ISI), \qquad (20)$$

for any $t > 0$ and any pure initial condition $x^0$. The result should be adapted to
the Reed-Frost-type models by replacing I with E in (20).


Proof. We initially provide the shortest proof, for which we only need statements
involving nodes $i$, $j$ and $k$. This is fully general, but we believe that explicitly
showing how the nodes of the triplet interact with the neighbouring nodes might
clarify the argument even further. Therefore, we later present a slightly longer
elaboration, applied to the particular case of the Vine246 (Figure]^).

Consider any pure initial condition $x^0$, and use the following notation to
describe events:

$S_j$ : node $j$ is susceptible at $t$;

$I_{t_i}$ : node $i$ has been infected at time $t_i < t$ and is still infectious at time $t$.

Note that, although not explicitly stated, both these events depend on $x^0$. Then

$$P_{ij}(IS) = \int_0^t P(S_j \wedge I_{t_i})\,dt_i = \int_0^t P(S_j \mid I_{t_i})\,P(I_{t_i})\,dt_i \qquad (21)$$

and

$$P_{jk}(SS) = P(S_j \mid S_k)\,P(S_k). \qquad (22)$$

Also,

$$
\begin{aligned}
P_{ijk}(ISS) &= \int_0^t P(I_{t_i} \wedge S_j \wedge S_k)\,dt_i && (23)\\
&= \int_0^t P(I_{t_i} \wedge S_k \mid S_j)\,P(S_j)\,dt_i && (24)\\
&= \int_0^t P(I_{t_i} \mid S_j)\,P(S_k \mid S_j)\,P(S_j)\,dt_i && (25)\\
&= \int_0^t \frac{P(S_j \mid I_{t_i})\,P(I_{t_i})}{P(S_j)}\,\frac{P(S_j \mid S_k)\,P(S_k)}{P(S_j)}\,P(S_j)\,dt_i && (26)\\
&= \frac{\left[\int_0^t P(S_j \mid I_{t_i})\,P(I_{t_i})\,dt_i\right]\left[P(S_j \mid S_k)\,P(S_k)\right]}{P(S_j)} && (27)\\
&= \frac{P_{ij}(IS)\,P_{jk}(SS)}{P_j(S)} && (28)\\
&= P_{ijk,o}(ISS). && (29)
\end{aligned}
$$

Here, the key passage is between (24) and (25): conditional on node $j$ being
susceptible, the states of nodes $i$ and $k$ are independent. This heavily relies on the
tree-like structure and the assumption that individuals cannot regain susceptible
status after having been infected, so that if node $j$ is susceptible at time $t$, it has
been so for all times from 0 to $t$, and this has prevented any information from
passing from node $i$ to $k$ or vice versa. The step from (25) to (26) follows directly
from the definition of conditional probability.

The ISI case follows the same steps, though the behaviour of pair $(j, k)$ mirrors
that of $(i, j)$ and the proof involves a double integral over both $t_i$ and $t_k$. □
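The factorisation at the heart of the proof can be followed explicitly in a very small case. The worked example below is an illustration under stated assumptions, not part of the original proof: it takes a path of five nodes labelled 0-1-2-3-4 (hypothetical labels), the Markovian SI model with rate $\tau$, nodes 0 and 4 initially infected (a pure initial condition) and the closure applied to the triplet $(1,2,3)$. Because the two sides of node 2 evolve independently as long as node 2 is susceptible, every probability is a product of one-sided terms.

```latex
% Hypothetical example: path 0-1-2-3-4, Markovian SI model with rate \tau,
% nodes 0 and 4 initially infected, closure applied to the triplet (1,2,3).
\begin{align*}
\alpha &:= P(\text{node 1 is } I \text{ and has not infected node 2 by time } t)
        = \int_0^t \tau e^{-\tau s}\, e^{-\tau (t-s)}\,ds = \tau t\, e^{-\tau t},\\
\beta  &:= P(\text{node 1 is } S \text{ at time } t) = e^{-\tau t},\\
P_{123}(ISI) &= \alpha^2, \qquad P_{123}(ISS) = \alpha\beta,\\
P_{12}(IS) &= P_{23}(SI) = \alpha(\alpha+\beta), \qquad
P_{23}(SS) = \beta(\alpha+\beta), \qquad P_2(S) = (\alpha+\beta)^2,\\
\frac{P_{12}(IS)\,P_{23}(SI)}{P_2(S)} &= \alpha^2 = P_{123}(ISI), \qquad
\frac{P_{12}(IS)\,P_{23}(SS)}{P_2(S)} = \alpha\beta = P_{123}(ISS).
\end{align*}
```

By symmetry, the same $\alpha$ and $\beta$ describe the side of node 3, which is why each pair marginal carries one factor $(\alpha+\beta)$ for the opposite side and the ratio cancels exactly.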


Proof applied to the Vine246 case. We now work out a slightly more laborious
proof, where we explicitly consider the neighbouring nodes of the triplet. For
this we use the notation relative to the Vine246 network (Figure [^), although it
is clear that generalisation to any tree-like network is straightforward. We now
write $H(i_1, i_2, \dots, i_n)$ to denote the joint history of the states of nodes
$i_1, i_2, \dots, i_n$ in the time interval $[0, t]$, and we denote generically by
$\int_{H(i_1, i_2, \dots, i_n)}$ the integral over all
possible such joint histories. The passages of the proof are essentially the same as
above, so we only focus on the ISI state.

We have

$$
\begin{aligned}
&P_{246}(ISI) && (30)\\
&= \int_{H(1,3,5,7,8)} \int_0^t\!\!\int_0^t P\big(S_4 \wedge I_{t_2} \wedge I_{t_6} \wedge H(1,3,5,7,8)\big)\,dt_2\,dt_6 && (31)\\
&= \int_{H(1,3,5,7,8)} \int_0^t\!\!\int_0^t P\big(I_{t_2} \wedge I_{t_6} \wedge H(1,3,5,7,8) \mid S_4\big)\,P(S_4)\,dt_2\,dt_6 && (32)\\
&= \int_{H(1,3,5,7,8)} \int_0^t\!\!\int_0^t P\big(I_{t_2} \wedge H(1,3) \mid S_4\big)\,P\big(I_{t_6} \wedge H(7,8) \mid S_4\big)\,P\big(H(5) \mid S_4\big)\,P(S_4)\,dt_2\,dt_6. && (33)
\end{aligned}
$$

The last passage is the key step due to independence between all branches
separated by node 4 (i.e. rooted in nodes 2, 6 and 5). We now consider the separate
factors inside the integral. Using the definition of conditional probability and the
law of total probability,


Ih{i,3) Jo 


/ P(4A77(l,3)|54)dt2 = 

'0 

(34) 

f /•'P(X4A/„AH(1,3)) 

Ml,3)4 P(54) " 

(35) 

/o/h(i,3)P(S4A4A//(1,3))*2 

P(54) 

(36) 

/o'P(54A4)d72 

P(54) 

(37) 

P24(75) 

(38) 


The term involving event $I_{t_6}$ is dealt with analogously. Instead, the term
involving node 5, for which no information is available, simplifies to

\begin{align}
\int_{H(5)} P\big(H(5) \mid S_4\big) &= \int_{H(5)} \frac{P\big(S_4 \mid H(5)\big)\, P\big(H(5)\big)}{P(S_4)} \tag{39}\\
&= \frac{\int_{H(5)} P\big(S_4 \mid H(5)\big)\, P\big(H(5)\big)}{P(S_4)} = 1. \tag{40}
\end{align}

Substitution of (38), its analogue for $I_{t_6}$ and (40) into (33) leads to the
desired result. □
Remark. Note that the statement used in the proof that information cannot pass
through a susceptible node implicitly relies on the further assumption that the
state of a node only depends on those of its neighbours and not on the neighbours'
neighbours (nor on any other nodes). Some models might violate this assumption,
although it then becomes questionable whether a tree-like static network is a good
representation for such models.


Remark. Also, note that in most epidemic models the infectious life of a newly
infected node evolves as an autonomous process, i.e. it is not affected by the
neighbours or more generally by the environment. This assumption is convenient
but not strictly necessary for Theorem 1 and can be relaxed: for example, one
can imagine the states of nodes 1 and 3 of the Vine246 network affecting how and
when node 2 progresses, say, from the latent to the infectious stage.


Remark. Further, one can even imagine nodes 1 and 3 affecting the probability
that node 4 remains susceptible up to time t. This is the case, for example, of node 1
being also connected to 4, i.e. nodes 1, 2 and 4 forming a closed triangle. Then the
result of Theorem 1 would not apply to the triplet (1,2,4), but it would still apply
to triplets (1,4,6) and (2,4,6). More generally, the components of all branches
stemming from node 4 need not be sub-trees. The only requirement is that the
components containing the first and last node of the triplet on which the closure
is applied do not communicate if node 4 is susceptible. Therefore, Theorem 1 can
be extended to the case of triplets in which the central node is, in the terminology
of Kiss et al. [13], a cut-vertex.


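Remark. Finally, note that the need for a pure initial condition comes from the
fact that, if the initial condition is random, we would need to average both sides of
the equations in (20) over its distribution, thus getting, for example for state ISS,

\begin{equation}
P_{246}(ISS) = \mathbb{E}\left[ P_{246}(ISS \mid x^0) \right]
= \mathbb{E}\left[ \frac{P_{24}(IS \mid x^0)\, P_{46}(SS \mid x^0)}{P_4(S \mid x^0)} \right], \tag{41}
\end{equation}

which is in general different from

\begin{equation}
\frac{\mathbb{E}\left[ P_{24}(IS \mid x^0) \right] \mathbb{E}\left[ P_{46}(SS \mid x^0) \right]}{\mathbb{E}\left[ P_4(S \mid x^0) \right]} = P_{246,\circ}(ISS). \tag{42}
\end{equation}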
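
The effect of a mixed initial condition can be checked by hand on the smallest possible example. The sketch below is not taken from the paper: it assumes a Markovian SI model with per-link rate `tau` on a three-node line 1-2-3 (a hypothetical stand-in for the triplet 2-4-6), and a random initial condition that is ISS or SSI with equal probability. The closed-form probabilities in `pure_terms` follow from the fact that, from ISS, node 2 stays susceptible for an exponential time of rate `tau`.

```python
import math


def pure_terms(x0: str, tau: float, t: float):
    """Return (P(ISS), P_12(IS), P_23(SS), P_2(S)) at time t for the Markovian
    SI model on the line 1-2-3, started from the pure initial state x0."""
    e = math.exp(-tau * t)  # probability that one link has not yet transmitted
    if x0 == "ISS":
        # Nodes 2 and 3 are both still susceptible iff link 1-2 has not fired.
        return (e, e, e, e)
    if x0 == "SSI":
        # Node 3 is infected from the start, so ISS and pair state SS for (2,3)
        # are impossible; node 1 can only be infected after node 2.
        return (0.0, 0.0, 0.0, e)
    raise ValueError(f"unsupported initial state: {x0}")


def mixed_initial_condition(weights: dict, tau: float, t: float):
    """Average over a random initial condition and compare the exact triplet
    probability with the pair-based closure built from averaged pairs."""
    exact = avg_is = avg_ss = avg_s = 0.0
    for x0, w in weights.items():
        p_iss, p12_is, p23_ss, p2_s = pure_terms(x0, tau, t)
        exact += w * p_iss
        avg_is += w * p12_is
        avg_ss += w * p23_ss
        avg_s += w * p2_s
    closure = avg_is * avg_ss / avg_s
    return exact, closure


exact, closure = mixed_initial_condition({"ISS": 0.5, "SSI": 0.5}, tau=1.0, t=1.0)
print(f"exact P(ISS) = {exact:.4f}, closure = {closure:.4f}")
```

With a single pure initial state the two quantities coincide, while for the equal mixture the closure returns half of the exact value, which is the discrepancy between the average of a ratio and the ratio of averages described in the remark.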


Before concluding this section, we further point out that, on a tree-like
network, the simplest local moment closure that retains exactness is the pair-based
approximation. Consider in fact an open triplet with node 2 being the central
node and assume that the initial state is $x^0 = (ISS)$. The only natural closure
for approximating the probabilities over pairs in terms of the probabilities of the
states of single nodes is the one referred to as individual-based approximation in
Sharkey [20]. Denoting it by $\pi$, it can be written, for example for the SI model
and for pair (2,3) in state IS, as $P_{23,\pi}(IS) = P_2(I)\, P_3(S)$. However,
$P_{23,\pi}(IS) = [P(IIS) + P(III)]\, [P(ISS) + P(IIS)]$, which simple algebra shows is
in general different from $P(IIS)$, so that the closure at the level of single nodes
is not exact even in this simple case.
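
The algebra can be made concrete with a short numerical check. The sketch below is an illustration under stated assumptions, not the authors' code: it takes a Markovian SI model with per-link transmission rate `tau` on the open triplet 1-2-3 started from ISS, for which the state probabilities have the closed forms used in `open_triplet_si`. The function names are hypothetical.

```python
import math


def open_triplet_si(tau: float, t: float) -> dict:
    """Exact state probabilities for the Markovian SI model on the line 1-2-3,
    starting from the pure state ISS (only node 1 infected)."""
    x = tau * t
    e = math.exp(-x)
    # Node 2 is infected after an Exp(tau) time, node 3 after a further Exp(tau).
    return {"ISS": e, "IIS": x * e, "III": 1.0 - e * (1.0 + x)}


def closures(tau: float, t: float):
    """Compare the individual-based and pair-based closures with exact values."""
    p = open_triplet_si(tau, t)
    p2_i = p["IIS"] + p["III"]   # P_2(I)
    p3_s = p["ISS"] + p["IIS"]   # P_3(S)
    individual = p2_i * p3_s     # individual-based estimate of P_23(IS)
    exact_pair = p["IIS"]        # exact P_23(IS)
    # Pair-based closure of the triplet state ISS: P_12(IS) P_23(SS) / P_2(S).
    # Each of these three marginals reduces to P(ISS) in this setting.
    pair_closure = p["ISS"] * p["ISS"] / p["ISS"]
    return individual, exact_pair, pair_closure, p["ISS"]


individual, exact_pair, pair_closure, exact_iss = closures(tau=1.0, t=1.0)
print(f"individual-based: {individual:.4f}  exact P_23(IS): {exact_pair:.4f}")
print(f"pair-based: {pair_closure:.4f}  exact P(ISS): {exact_iss:.4f}")
```

In this setting the pair-based closure reproduces $P(ISS)$ exactly, while the individual-based product overestimates $P_{23}(IS)$, because it ignores the fact that node 3 can only be infected after node 2.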


8.2. Networks containing closed triangles 

The examples of Section 7 suggest that, for the SI model starting with a single
initial infective, Kirkwood's and 1-step ME approximations fail everywhere, but
ME seems to work for the dynamically important states ISS and ISI if the network
contains triangles that do not overlap (MartiniGlass234 and BowTie345).

In Figure 11(d) we confirm this intuition by simulations on a large network of
non-overlapping triangles. The results are obtained by numerically solving the set
of ODEs, for the SI model, closed at the level of pairs using Kirkwood and ME.
Code for the former was provided by Sharkey [21], and we provide Matlab
code for the latter as Electronic Supplementary Material. This verifies that the
latter, unlike the former, is able to capture the dynamics over non-overlapping
triangles correctly, in the case of a single initially infected node.
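
For readers unfamiliar with the Kirkwood closure, the following sketch shows the standard superposition formula, the product of the three pair probabilities divided by the product of the three single-node probabilities. It is a minimal illustration, not the ODE code mentioned above. It assumes a Markovian SI model with per-link rate `tau` on an isolated closed triangle started from ISS, where the exact probabilities are known in closed form. In this smallest case node 1 is infected with certainty, and the formula returns the exact values; the failures discussed in the text arise on the larger motifs. All function names are hypothetical.

```python
import math


def triangle_si(tau: float, t: float) -> dict:
    """Exact state probabilities for the Markovian SI model on a closed
    triangle, starting from ISS (only node 1 infected)."""
    x = tau * t
    e = math.exp(-2.0 * x)  # neither of the two links out of node 1 has fired
    return {"ISS": e, "IIS": x * e, "ISI": x * e, "III": 1.0 - e * (1.0 + 2.0 * x)}


def marginal(p: dict, nodes: tuple, states: tuple) -> float:
    """Probability that the given nodes (0-based) are in the given states."""
    return sum(prob for s, prob in p.items()
               if all(s[n] == a for n, a in zip(nodes, states)))


def kirkwood(p: dict, state: str) -> float:
    """Kirkwood superposition estimate of a triplet state from pairs and singles."""
    a, b, c = state
    num = (marginal(p, (0, 1), (a, b))
           * marginal(p, (1, 2), (b, c))
           * marginal(p, (0, 2), (a, c)))
    den = (marginal(p, (0,), (a,))
           * marginal(p, (1,), (b,))
           * marginal(p, (2,), (c,)))
    return num / den if den > 0.0 else 0.0


p = triangle_si(tau=1.0, t=0.5)
for state, prob in p.items():
    print(state, f"exact = {prob:.4f}", f"Kirkwood = {kirkwood(p, state):.4f}")
```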

The discussion of Section 7.1.2 suggests that ME also works correctly
in the presence of some overlapping triangles, as long as the infection is not
allowed to enter the same triangle through two different routes simultaneously
(e.g. ToastB234); it can, though, enter, then leave and re-enter (e.g. KiteDiagA), as
long as there is only one introduction point in the triangle. However, we have also
noticed in Section 7.1.1 that the closure on an open triplet inside a loop larger than
a triangle is in general not exact: therefore, for example on the KiteDiagA graph,
while the dynamics on the closed triangles 234 and 254 are handled correctly
by ME, the closure applied on the open triplet 345 fails. The ODEs for a large
network of ToastA motifs, however, are numerically unstable, meaning that we were
not able to determine the exactness of the overall dynamics on large networks with
overlapping triangles like those of ToastA or KiteDiagA motifs, which therefore
remains an open problem.


8.3. Further Intuition 

The details of moment closure performance are complex, but our intuition is
that the two factors that make closures fail are simply: (1) mixed initial conditions
and (2) random time at which recovery occurs (which can be the case even with
a constant duration of infection, if the time of infection is random). The former
is already known to create problems [19]; the latter has been highlighted here in
the Proposition stated earlier.

The reason why all closures considered here fail for the SIR model on the
MartiniGlass234 graph is factor (2): even with a constant duration of the
infectious period, if we know that node 2 is in state I at t, we do not know when it
recovers, as that depends on how long before t it was infected. So, for any graph
that contains a triangle, if the initial infective is not in the triangle, all closures
will fail when an individual can be in state I (i.e. not RF) and recovers after a finite
time (i.e. not SI).

One last comment concerns the absorbing states over triangles in the presence
of recovery (RSS, RRS and RRR) for $t \to \infty$
(see Figures S9 and S10 in the Supplementary Material). Unlike Kirkwood and
1-step ME, the closure based on ME seems to be able to capture the distribution over
the absorbing states correctly for the MartiniGlass234 (Figure S9), even in the
case of the SIR and the SEIR models, i.e. when the dynamics are not themselves
captured correctly. However, this is not the case for the ToastB234 (Figure S10).
Therefore, we suggest that factor (1) above causes ME to fail in calculating the
final size correctly, while factor (2) does not.

8.4. Conjectures 

Given the intuition developed in all the previous sections, for the Markovian
SIR model with transmission rate $\tau$ and recovery rate $\gamma$ as most commonly
considered in the literature, we expect that errors in moment closure schemes will be
introduced by the following factors:

1. Finite length of infectious period. Given $\tau/\gamma$ is the only dimensionless
parameter in the model, we conjecture that these errors will be $O(\gamma/\tau)$.

2. Long loops and some overlapping triangles. Where there is a clustering
coefficient $\phi$ and triangles are introduced in a combinatorially random manner,
we conjecture such errors are 

Of course, as the epidemic spreads, errors can accumulate, so we expect the
approximations to be less accurate at larger times than at smaller times.

9. Conclusions 

We have presented here a detailed examination of the behaviour of the most
commonly used moment closure approximations, with particular attention to the
newly proposed approximation based on the concept of maximum entropy. On
an open triplet, this approximation coincides with the one commonly used in the
literature. On a closed triangle, instead, the ME approximation is substantially
more complex than the commonly used Kirkwood approximation, but overcomes
its fundamental theoretical drawbacks and, overall, seems to perform better.

One of the interesting results from our work is that, when moving away from
the commonly considered Markovian assumption, the perspective can change
dramatically, with all approximations being actually exact on the closed triangle when
the infectious periods have constant duration (see the Proposition). This agrees with the
intuition that we are trying to reconstruct a joint distribution through a product
of marginals, which is likely to work only when an assumption of independence
holds.

On larger networks, we have provided a simpler proof of the result of Sharkey
et al. [19] concerning the exactness of moment closure for the SIR model on
tree-like networks under pure initial conditions. Our proof also extends the result to
more general models. Concerning larger networks with clustering, the extensive
numerical investigation we have performed on small motifs suggests that ME allows
exact closure at the level of pairs on some large networks with non-overlapping
triangles for both SI and Reed-Frost-type models when a single initial infective
is present. Large-scale numerical simulations confirm such conclusions for the SI
model.

Moving on from exactness of moment closure to the quality of the
approximations still requires substantial work. For example, even on the simple closed
triangle, none of the closure techniques appears to be uniformly better than any
other, and the heterogeneity of their quality over different transitions $x^0 \to x$
suggests that the choice of which one performs best will still be context-dependent.
This was already noticed by Rogers [18], who showed that, in a specific example
of an SIR epidemic spreading on a small-world network, the ME approximation
can still lead to a worse overall performance than Kirkwood's. Rogers claims this
is due to a fortunate error cancellation, where the underestimation in Kirkwood's
approximation of the number of susceptibles in closed triangles in the network is
compensated by its overestimation in open triplets. This appears incorrect in light
of Theorem 1 and the work of Sharkey et al. [19]. For a small-world network,
however, there is a non-negligible presence of short loops larger than a triangle and
we believe that the failure of the closure for the open triplets that form a square is
likely to be the actual cause of the improved performance of Kirkwood.

We hope the intuition built up through this extensive exploration can open
many lines of thought for researchers in the epidemic modelling community
and beyond. In particular, we believe that it represents a valuable step in
unravelling the assumptions behind local moment closure on networks. Without this
understanding, there is arguably no hope to control the errors that build up in
global moment closure approximation schemes. Given their versatility and the
significant dimensionality reduction they can achieve, the ability to control their
errors and to put them on a solid mathematical footing would represent a key and
much desired methodological achievement.

Acknowledgements 

We gratefully acknowledge the Engineering and Physical Sciences Research
Council for funding and the two anonymous reviewers for comments that led to
a substantially improved version of this manuscript.

References 

[1] Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.-U., 2006.
Complex networks: Structure and dynamics. Physics Reports 424 (4-5),
175-308.

[2] Csiszár, I., Shields, P. C., 2004. Information Theory and Statistics: A
Tutorial. Now Publishers Inc.

[3] Danon, L., Ford, A. P., House, T., Jewell, C. P., Keeling, M. J., Roberts,
G. O., Ross, J. V., Vernon, M. C., 2011. Networks and the epidemiology
of infectious disease. Interdisciplinary Perspectives on Infectious Diseases
2011, 1-28.

[4] Durrett, R., 2007. Random Graph Dynamics. Cambridge University Press.

[5] Eames, K. T. D., Keeling, M. J., 2002. Modeling dynamic and network
heterogeneities in the spread of sexually transmitted diseases. Proceedings of the
National Academy of Sciences 99 (20), 13330-13335.

[6] Ferguson, N. M., Donnelly, C. A., Anderson, R. M., 2001. The foot-and-mouth
epidemic in Great Britain: pattern of spread and impact of interventions.
Science 292 (5519), 1155-60.

[7] House, T., Keeling, M. J., 2010. The impact of contact tracing in clustered
populations. PLoS Computational Biology 6 (3), e1000721.

[8] House, T., Keeling, M. J., 2011. Insights from unifying modern
approximations to infections on networks. Journal of The Royal Society Interface
8 (54), 67-73.

[9] House, T., et al., 2010. Contingency planning for a deliberate release of
smallpox in Great Britain: the role of geographical scale and contact
structure. BMC Infectious Diseases 10, 25.

[10] Karrer, B., Newman, M., 2010. Message passing approach for general
epidemic models. Physical Review E 82 (1), 016101.

[11] Keeling, M. J., 1999. The effects of local spatial structure on
epidemiological invasions. Proceedings of the Royal Society B 266 (1421), 859-67.

[12] Kirkwood, J. G., 1935. Statistical mechanics of fluid mixtures. The Journal
of Chemical Physics 3 (5), 300-313.

[13] Kiss, I. Z., Morris, C. G., Selley, F., Simon, P. L., Wilkinson, R. R., 2014.
Exact deterministic representation of Markovian SIR epidemics on networks
with and without loops. Journal of Mathematical Biology, 1-28.

[14] Neal, P., 2003. SIR epidemics on a Bernoulli random graph. Journal of
Applied Probability 40 (3), 779-782.

[15] Newman, M. E. J., 2010. Networks: An Introduction. Oxford University
Press.

[16] Rand, D. A., 1999. Correlation equations and pair approximations for spatial
ecologies. CWI Quarterly 12 (3&4), 329-368.

[17] Read, J. M., Edmunds, W. J., Riley, S., Lessler, J., Cummings, D. A. T.,
2012. Close encounters of the infectious kind: methods to measure social
mixing behaviour. Epidemiology and Infection 140 (12), 2117-2130.

[18] Rogers, T., 2011. Maximum-entropy moment-closure for stochastic
systems on networks. Journal of Statistical Mechanics: Theory and Experiment
2011 (05), P05007.

[19] Sharkey, K., Kiss, I., Wilkinson, R., Simon, P., 2013. Exact equations for SIR
epidemics on tree graphs. Bulletin of Mathematical Biology, 1-32.

[20] Sharkey, K. J., 2008. Deterministic epidemiological models at the individual
level. Journal of Mathematical Biology 57 (3), 311-331.

[21] Sharkey, K. J., 2011. Deterministic epidemic models on contact networks:
Correlations and unbiological terms. Theoretical Population Biology 79 (4),
115-129.

[22] Sharkey, K. J., Kiss, I. Z., Wilkinson, R. R., Simon, P. L., 2012. Exact
equations for SIR epidemics on unclustered networks. arXiv:1212.2172.

[23] Taylor, M., Simon, P. L., Green, D. M., House, T., Kiss, I. Z., 2011. From
Markovian to pairwise epidemic models and the performance of moment
closure approximations. Journal of Mathematical Biology, 1-22.

[24] Trapman, P., 2007. Reproduction numbers for epidemics on networks using
pair approximation. Mathematical Biosciences 210 (2), 464-89.

[25] Wilkinson, R. R., Sharkey, K. J., 2014. Message passing and moment
closure for susceptible-infected-recovered epidemics on finite networks.
Physical Review E 89 (2), 022808.



Figure 1: Set of states and transitions for the SIR model on the open triplet and 
closed triangle. The starting point is marked 0, and absorbing states 0 °. Open¬ 
headed arrows relate to recovery, and filled ones to transmission. All lines are 
present for the closed triangle, and either the dotted or dashed lines are absent 
depending on the initial conditions of the open triple. 
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Figure 2: Exact (★) and approximate {o,jl,K and p) probabilities for an open 
triplet (left two bars) and a closed triangle (right four bars) being in state x at time 
t = \ when starting from state x® at time t = 0, for some selected cases x® —)■ x. 
For the open triplet and the closed triangle, respectively, the exact probability is 
coloured in black while the lighter the shade of grey of each approximation, the 
larger its relative difference with the (appropriate) exact probability (black = 0%, 
white = 100%). The Markovian model is assumed, with infectivity T = 1 and 
average duration of the infectious period niT = 1. 
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Figure 3: Time integral of the sum of squared difference (SSD) between the exact 
and the approximate probability distributions over all states x of an open triplet, 
starting from each of the three states x® = (5/5), (/55) and (ISI), as a function of 
the number of infectious classes in the SIR model, for various values of the infec- 
tivity T. The v-axis is scaled so that the variance of the duration of the infectious 
period in the presence of nj (equally infectious) classes, Var(r) = l/nj, appears 
increasing linearly. 



Figure 4: Time integral of the sum of squared difference (SSD) between the exact 
probability distributions over all states x of a closed triangle and each of the three 
approximations, starting from each of the three states x® = (5/5), (/55) and (/5/), 
as a function of the number of infectious classes of the SIR model (v-axis linearly 
increasing with the variance). 
Figure 5: Time integral of the sum of squared difference (SSD) between the exact
probability distributions over all states x of a closed triangle and each of the three
approximations, starting from each of the three states $x^0$ = (SIS), (ISS) and (ISI),
as a function of the infectivity $\tau$ of the Markovian SIR model.
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Figure 6: Motifs analysed in Section 
Figure 7: Investigation of the error in the three moment closures (Kirkwood, k;
1-step ME, p; and ME, p) for the SIR model on a closed triangle at time t = 0.5. The
left axes (grey dashed lines with 5-point star markers) show the probability that
at t = 0.5 the system is in the state of interest. In addition to the two dynamically
important states ISS and ISI, we plotted the results for RIS, as an example of a
state that, in the SIR-C model, occurs with negligible probability at time t = 0.5.



Figure 8: Investigation of the error in the three moment closures (Kirkwood, k;
1-step ME, p; and ME, p) for the SIR model on the MartiniGlass234 network at
time t = 1.5. The left axes (grey dashed lines with 5-point star markers) show
the probability that at t = 1.5 the system is in the state of interest. In addition to
the two dynamically important states ISS and ISI, we plotted the results for RIS.
All states occur with positive probability at t = 1.5, for all models from M to C,
and no closure is exact.
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X X 
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X X 

X X 

X X 



Closed 

results for p] 

Triangle 

MartiniGlas5234 

BowTie345 

/ / 

/ / 

/ / 

/ / 

/ / 

/ / 


All exact 

p and Kfail everywhere 
p and Kfail everywhere 

ToastA123 

KiteDiagA234 

FishDiagA345 

/ / 

/ / 

/ / 

/ / 

/ / 

/ / 


All exact 

p and Kfail everywhere 
p and Kfail everywhere 

4Fulll23 

KiteFull234 

FishFull345 

/ / 

/ / 

/ / 

/ / 

/ / 

/ / 


All exact 

p and Kfail everywhere 
p and Kfail everywhere 

ToastB234 

KiteDiag8345 

FishDiagB456 

X X 

X X 

X X 

X X 

X X 

X X 



4Full234 

KiteFull345 

FishFull456 
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X X 

X X 

X X 

X X 
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DoubleKite237 

DoubleFish348 

/ / 

/ / 

/ / 

/ / 


p and Kfail everywhere 
p and Kfail everywhere 


Multiple initial infective 

s (initial infectives in bracke 

ts) 


4Linel23(land4) 

4Line234 (land4) 

5Line234 (land5) 

/ / 

0 / 

/ / 

/ / 

0 / 

X X 



Treel23 (lands) 
Tree324(land5) 

Tree324 (1, 5 and 8) 

/ / 

/ 0 

/ / 

/ / 

X X 

X X 


p fails also on SIS and Sll 
p fails also on SIS and Sll 

Vine246{land8) 

/ / 

X X 



Tripod234 (1 and 6; p only) 

X X 

X X 




Table 1: Exactness of moment closures at the level of triplets for the SI model.
Network names refer to the figure of motifs and are appended with the list of nodes
the closure is applied to. If not specified, the approximation is tested at t = 0.5.
Closures can: "work", i.e. be exact (/) at the time tested, suggesting general validity;
"fail" to be exact (X); or refer to a state that is never reached by the system (0),
in which case they work but provide no useful information about their general
validity. Grey areas highlight test results that provide key understanding and that
are discussed in the main text.
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SIR-C model 

Dynamically important 

Dynamically not important 

Time 

Notes 1 

ISS ISI 

IIS III 


Triplet 

/ 0 

/ / 


ISS Is exact also for M; IIS and III are not 

4Linel23 

4Line234 

5Line234 

/ 0 

/ 0 

/ 0 

/ / 

/ / 

/ / 

1.5 

ISS and IIS are exact also for M; III is not 

ISS and IIS are exact also for M; III is not 

ISS and IIS are exact also for M; III is not 

3Starl23 

Treel23 

Tree248 

/ 0 

/ / 


ISS and IIS are exact also for M; III is not 

3st3r324 

Tree324 

0 0 

X X 

1.5 

Fails also on SIS and Sll, for both C and M 

Squarel23 

KiteEmpty234 

FishEmpty345 

/ / 

/ / 

/ / 

X X 

1.5 

All fail for M 

All fail for M 

ToastB123 

KiteDiagB234 

FishDiagB345 

/ / 

X X 

/ / 

X X 

1.5 

All fail for M 

All fail for M 

Square234 

KiteEmpty345 

FishEmpty456 

/ / 

X X 


All fail for M 

ToastA234 

KiteDiagA345 

FishDiagA456 

/ / 

X X 


All fail for M 

Closed (results for p) 

Triangle 

MartiniGlass234 

BowTie345 

/ / 

X X 

/ / 

X X 

1.5 

All exact for C, but all fail for M 

All fail for both C and M; Exceptions: p exact for C (but not M) 
when t < mj and for M and C in final states (RSS, RRS, RSR, RRR) 

ToastA123 

KiteDiagA234 

FishDiagA345 

/ / 

X X 

/ / 

X X 

1.5 

All exact for C, but all fail for M 

All fail for both C and M; Exceptions: p exact for C (but not M) 
when t < mj and for M and C in final states (RSS, RRS, RSR, RRR) 

4Fulll23 

KiteFull234 

FishFull345 

/ / 

/ / 


All exact for C, but all fail for M 

ToastB234 

KiteDiagB345 

FishDiagB456 





4FUII234 

KiteFull345 

FishFull456 





DoubleKite237 

DoubleFish348 





Multiple initial infectives (initial infectives in brackets) 

4Linel23(land4) 

4Line234(land4) 

5Line234(land5) 





Treel23(land5) 

Tree324 (1 and 5) 
Tree324(l, 5and8) 





Vine246 (1 and 8) 





Tripod234 (1 and 6; p only) 






Table 2: Exactness of moment closures at the level of triplets for the SIR model
with a constant duration of the infectious period (C). Comments for the Markovian
model (M) with exponentially distributed duration of infection are also reported
when useful. Time of test is t = 0.5 when not stated. Only the interesting results
are reported. Symbols and table structure are as per Table 1.
Figure 9: Investigation of the error in the three moment closures (Kirkwood, k;
1-step ME, p; and ME, p) for the RF model on the KiteDiagB345 network at
time t = 2.5. The left axes (grey dashed lines with 5-point star markers) show
the probability that at t = 2.5 the system is in the state of interest. Note that state
EES never occurs with positive probability, because individuals 3 and 4 can never
be infected at the same time: if 4 is in the E state, 3 was either a potential infector
(and so is now in the R state) or has escaped the infection from 2 and is therefore in
state S. Also note the dubious convergence, which is difficult to investigate because
of the computational cost involved.


[Figure panels: ESSSS → ESS, ESSSS → ESE, ESSSS → EES]

Figure 10: Investigation of the error in the three moment closures (Kirkwood, k;
1-step ME, p; and ME, p) for the RF model on the KiteFull345 network at time
t = 2.5. The left axes (grey dashed lines with 5-point star markers) show the
probability that at t = 2.5 the system is in the state of interest. Note now the clear
lack of convergence.
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Reed-Frost-CC model 

Dynamically important 

Dynamically not important 

Time 

Notes 

ESS ESE 

EES EEE 






Triplet 





4Linel23 

4Line234 

5Line234 





3Starl23 

Treel23 

Tree248 





3star324 

Tree324 





Squarel23 

KiteEmpty234 

FishEmpty345 

/ 0 

/ 0 

0 0 

0 0 

0.5 

1.5 

ESS exact also for MM, CM and MC 

ESS exact also for MM, CM and MC 

ToastB123 

KiteDiagB234 

FishD(agB345 





Square234 

KiteEmpty345 

FishEmptv456 

/ / 

/ / 

0 0 

0 0 

1.5 

2.5 

ESS exact also for MM, CM and MC; anything "derived" from ESE 
(e.g. RSR, RER, RRR) fails too 

ToastA234 

KiteDiagA345 

FishDiagA456 






Triangle 

MartiniGla5s234 

BowTie345 

ToastA123 

KiteDiagA234 

FishDiagA345 


Closed (results for p.) 


ESS works also for p (also for MM, MC and CM), but all other states (RES, REE, 
etc.) work only for p (again, also for MM, MC and CM); most states are 0 as 
nodes can only be infected sequentially (cannot be E at the same time) 

ESS works also for p (also for MM, MC and CM), but all other states (RES, REE, 
etc.) work only for p (again, also for MM, MC and CM); most states are 0 as 
nodes can only be infected sequentially (cannot be E at the same time) 


4Fulll23 

KiteFull234 

FishFull345 


ESS works also for p (also for MM, MC and CM), but all other states (RES, REE, 
etc.) work only for p (again, also for MM, MC and CM); most states are 0 as 
nodes can only be infected sequentially (cannot be E at the same time) 


ToastB234 

KiteDiagB345 

FishDiagB456 


Maybe 

Maybe 


Maybe 

Maybe 


Maybe p also works for ESS and ESE (CC only, all other fail), but K fails at least 
on ESS. All closures fail also for CC on all other states that occur with positive 
probability, including final states (RSS, RSR, RRR...) 


|4Full234 

KiteFull345 

FishFull456 

DoubleKite237 

DoubleFish348 

4Linel23(land 4) 
4Line234 (1 and 4) 
5Line234 (1 and 5) 


All closures fail for all models on all other states that occur with positive 
probability, including final states (RSS, RSR, RRR...). 


Multiple initial infectivi 


es (initial infectives in brackets) 


Numerical errors (le-9) make closure look exact for MM, but not for CC; 
however. Theorem 1 guarantees the closure is exact 


Treel23(land 5) 
Tree324(land 5) 
Tree324 (l, SandS) 


Vine246(land8) 


Tripod234 (1 and 6; p only) 


Table 3: Exactness of moment closures at the level of triplets for the Reed-Frost
model with a constant duration of the latent period and non-random probability
P = p of transmission (CC). Comments for exponentially distributed latent period
or geometrically distributed probability of transmission P, or both (MC, CM or
MM, respectively) are also reported when useful. Time of test is t = 0.5 when not
stated. Only the interesting results are reported. Symbols and table structure are
as per Table 1.
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Figure 11: SI dynamics on larger networks, (a) A tree network, (b) Mean numbers 
infective over time at different Rank (distance from the central node) for the tree 
network, for Monte Carlo simulation (markers) and exact ODE models (lines), 
(c) A tree-of-triangles network, (d) Mean numbers infective over time at differ¬ 
ent Rank for the tree-of-triangles network, for Monte Carlo simulation (markers), 
inexact Kirkwood ODEs (grey lines), and exact Maximum Entropy ODEs (black 
lines). 
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Exact and approximate moment closures for 
non-Markovian network epidemics 

Supplementary Material 

Lorenzo Pellis Thomas House Matthew J. Keeling 

Open triplet 

Figure 2 of the main text hides in an aggregate measure most of the
heterogeneity in the performance of the standard approximation for an open triplet.
Figure 12 unravels some of this heterogeneity, revealing positive and negative errors
in different cases, exactness in others and a particularly poor performance when
starting from state $x^0$ = (SIS). Figure 13 plots the same results as a function of
the infection rate $\tau$.


Closed triangle 


The complex behaviour of the three approximations in all different cases makes
it difficult to have a full overview of their accuracy. Here we finally present an
almost exhaustive list of all interesting $x^0 \to x$ cases. Figures 14 and 15 show
the error $e^*(x;t)$ as a function of time for the Markovian model with infectivity
$\tau = 1$. Note, as already observed in the main text, how ME performs poorly
compared to Kirkwood's for the case (ISS) → (ISS), how Kirkwood's approximation
is strongly inaccurate for (ISS) → (III) and how both fail to capture correctly
the case (ISS) → (RSS) (although the relative performance of ME improves
dramatically for larger values of t; not shown). Note also how both ME and
Kirkwood's approximations give the same results for the cases (ISS) → (ISR) and
(ISS) → (IRS), as they are symmetrical, but 1-step ME does not. Further exploration
of how 1-step ME performs when reaching the same state x from all three
initial states $x^0$ = (ISS), (SIS) and (SSI) always reveals an identical behaviour in
two out of the three cases, and a different behaviour for the third one. Errors
obtained when starting from $x^0$ = (ISI) are significantly smaller than when starting
from a single initial infective (Figure 16). More strikingly, even though Kirkwood's
approximation appears to be quite inaccurate in general, it turns out to be
exact in the particular cases of x = (ISI), (ISR) (and thus (RSI)) and (RSR),
when starting from $x^0$ = (ISI).


The heterogeneous behaviour highlighted by Figure 5 in the main text suggests
that ME, though better than the other approximations in general, is not uniformly
so. Figure 17 explores how the time integral of the absolute error $|e^*(x;t)|$
depends on $\tau$ in various cases of interest. Note, first of all, how all errors
converge to 0 for large $\tau$. Second, note how ME is markedly inaccurate in the case
(ISS) → (ISS), how Kirkwood's performs poorly for (ISS) → (RSS) and (RRR)
while it is exact for (ISI) → (ISI), and how ME performs badly compared to
Kirkwood's for small $\tau$ in the case (ISS) → (RRS). Finally, we report in Figure
18 the stratified contribution to the overall SSD measure for each approximation
and each starting point $x^0$ of interest. As highlighted in the main text, in addition
to showing a quantitatively smaller discrepancy, ME seems to be always balancing
the discrepancy between exact results and approximations more evenly across
states and in time.
Supplementary Figures

Figure 12: Error $e^*(x;t)$ between the exact and approximate probabilities,
in the SIR model, of an open triplet being in state x at time
t = 1, when starting from state $x^0$ at time t = 0, for various choices of $x^0$ and x, as
a function of the number of infectious classes (x-axis linearly increasing with the
variance), for various values of the infectivity $\tau$. Note the different scale of the y
axis of the bottom row.
[Plot for Figure 13. y-axis: probability difference at t; x-axis: infectivity, 0 to 5.]

Figure 13: Error $e^{c}(x;t) = P^{\mathrm{ex}}(x;t) - P^{c}(x;t)$ between the exact and
approximate probabilities, in the SIR model, of an open triplet being in state x at time
t = 1, when starting from state $x^0$ at time t = 0, for various choices of $x^0$ and x, as
a function of the infectivity T, for various numbers of infectious classes. Note the
different scale of the y axis of the bottom row.
[Plot for Figure 14. Panels: ISS → ISS, ISS → ISI, ISS → ISR, ISS → IRS, ISS → IRI, ISS → IRR. x-axis: time, 0 to 2.]

Figure 14: Error $e^{c}(x;t) = P^{\mathrm{ex}}(x;t) - P^{c}(x;t)$ between the exact and
approximate probabilities of a closed triangle being in state x as a function of time, when
starting from state $x^0$ at time t = 0, for various choices of $x^0$ and x. The model is
Markovian SIR with infectivity T = 1.
[Plot for Figure 15. Panels: ISS → RSS, ISS → RSI, ISS → RSR. x-axis: time.]

Figure 15: Error $e^{c}(x;t) = P^{\mathrm{ex}}(x;t) - P^{c}(x;t)$ between the exact and
approximate probabilities of a closed triangle being in state x as a function of time, when
starting from state $x^0$ at time t = 0, for various choices of $x^0$ and x. The model is
Markovian SIR with infectivity T = 1.
[Plot for Figure 16. x-axis: time.]


Figure 16: Error $e^{c}(x;t) = P^{\mathrm{ex}}(x;t) - P^{c}(x;t)$ between the exact and
approximate probabilities of a closed triangle being in state x as a function of time, when
starting from state $x^0$ at time t = 0, for various choices of $x^0$ and x. The model is
Markovian with infectivity T = 1.
[Plot for Figure 17. Panels: ISS → ISS, ISS → IIS, ISS → III.]



Figure 17: Time integral of the modulus of the difference
$e^{c}(x;t) = P^{\mathrm{ex}}(x;t) - P^{c}(x;t)$ between the exact and approximate
probabilities of a closed triangle being in state x, when starting from state $x^0$ at
time t = 0, as a function of T. The Markovian SIR model is assumed.
Figure 18: Errors $e^{c}(x;t) = P^{\mathrm{ex}}(x;t) - P^{c}(x;t)$ (c = K, p and μ) between the
exact and approximate probabilities of an open triplet being in state x at time
t = 1, when starting from state $x^0$ at time t = 0, for some selected choices of $x^0$
and x, as a function of the number of infectious classes (x-axis linearly increasing
with the variance), for infectivity T = 1.
[Plot for Figure 19. Panel titles include "From SIS using p" and "From ISS using p". x-axis: time.]



Figure 19: Stratified contribution of each state x to the overall discrepancy
between the exact distribution over system states and each of the approximations
(top row: Kirkwood’s approximation; middle row: 1-step ME; bottom row: full
ME), for different starting points $x^0$ (one per column). The model is Markovian
with infectivity T = 1.
SEIR-CC model | Dynamically important: ISS | Dynamically important: ISI | Dynamically not important: IIS | Dynamically not important: III | Time | Notes

Open
Triplet | ✓ | 0 | 0 | 0 | 1.5 | ISS is also exact for MM, MC and CM; ISI always 0; IIS and III are non-0 (and fail) for MM, MC and CM
4Line123 | ✓ | 0 | 0 | 0 | 1.5 | ISS is also exact for MM, MC and CM; ISI always 0; IIS and III are non-0 (and fail) for MM, MC and CM
4Line234 | ✓ | 0 | 0 | 0 | 2.5 | ISS is also exact for MM, MC and CM; ISI always 0; IIS and III are non-0 (and fail) for MM, MC and CM
5Line234 | ✓ | 0 | 0 | 0 | 2.5 | ISS is also exact for MM, MC and CM; ISI always 0; IIS and III are non-0 (and fail) for MM, MC and CM
3Star123, Tree123, Tree248 | ✓ | 0 | 0 | 0 | 1.5 | ISS is also exact for MM, MC and CM; ISI always 0; IIS and III are non-0 (and fail) for MM, MC and CM
3Star324, Tree324 | | | | | |
Square123, KiteEmpty234, FishEmpty345 | ✓ | 0 | 0 | 0 | 1.5 | ISS fails for M; ISI, IIS and III are non-0 (and fail) for MM, MC and CM
ToastB123, KiteDiagB234, FishDiagB345 | | | | | |
Square234, KiteEmpty345, FishEmpty456 | Maybe | X | 0 | 0 | 2.5 | ISS and ISI fail for MM, MC and CM; IIS and III are non-0 (and fail) for MM, MC and CM
ToastA234, KiteDiagA345, FishDiagA456 | | | | | |

Closed (results for p)
Triangle, MartiniGlass234, BowTie345 | Maybe | 0 | 0 | 0 | 1.5 | ISS fails for M; ISI, IIS and III are non-0 (and fail) for MM, MC and CM
ToastA123, KiteDiagA234, FishDiagA345 | Maybe | 0 | 0 | 0 | 1.5 | ISS fails for M; ISI, IIS and III are non-0 (and fail) for MM, MC and CM
4Full123, KiteFull234, FishFull345 | | | | | |
ToastB234, KiteDiagB345, FishDiagB456 | | | | | |
4Full234, KiteFull345, FishFull456 | | | | | |
DoubleKite237, DoubleFish348 | | | | | |

Multiple initial infectives (initial infectives in brackets)
4Line123 (1 and 4), 4Line234 (1 and 4), 5Line234 (1 and 5) | | | | | |
Tree123 (1 and 5), Tree324 (1 and 5), Tree324 (1, 5 and 8) | | | | | |
Vine246 (1 and 8) | | | | | |
Tripod234 (1 and 6; g only) | | | | | |

Table 4: Exactness of moment closures at the level of triplets for the SEIR
model with a constant duration of the latent and the infectious periods (CC).
Comments for the models where either the latent or the infectious period, or both (MC,
CM or MM, respectively), have exponentially distributed duration are also reported
when useful. Time of test is t = 0.5 when not stated. Only the interesting results
are reported. Symbols and table structure are as per Table 1 in the main text.
[Plot for Figure 20. Panels: ISSS → RSS, ISSS → RRS, ISSS → RRR.]




Figure 20: Investigation of the error in the three moment closures (Kirkwood, k;
1-step ME, p; and ME, p) for the SIR model on the MartiniGlass234 in its absorbing
states for t → ∞. The left axes (grey dashed lines with 5-point star markers)
show the probability that the system ultimately ends in the state of interest. Note
the exactness of ME as opposed to the other closures (see main text).


[Plot for Figure 21. Panels: ISSS → RSS, ISSS → RRS, ISSS → RRR. x-axis: number of I classes.]



Figure 21: Investigation of the error in the three moment closures (Kirkwood, k;
1-step ME, p; and ME, p) for the SIR model on the ToastB234 in its absorbing
states for t → ∞. The left axes (grey dashed lines with 5-point star markers) show
the probability that the system ultimately ends in the state of interest. Note how
all closures fail (see main text).
Supplementary Code

Main Function


function [Time,Y] = pair_based_me(T,g,I0,TSPAN,tol,niter)
% Modified version of code from Sharkey (2011) to use Maximum Entropy
% rather than Kirkwood closure

T=T';
N=length(T(:,1));

% Initial conditions: nodes listed in I0 are infectious, all others susceptible
I_vec=zeros(N,1);I_vec(I0)=1;
S_vec=ones(N,1);S_vec(I0)=0;
I_mat=spdiags(I_vec,0,N,N);
S_mat=spdiags(S_vec,0,N,N);

% Network structure
G=spones(T);
G_A=G.*(1-G');
H=G+G_A';
Q=min(H^2,1);Q=Q-diag(diag(Q));   % pairs of distinct nodes sharing a neighbour
H_c=Q.*H;                          % such pairs that are also directly linked (closed)
H_o=Q-H_c;                         % such pairs that are not directly linked (open)
F_AB=H;
F_AA=tril(H);

% Initial pair probabilities
IS=I_mat*F_AB*S_mat;
SS=S_mat*F_AA*S_mat;
II=I_mat*F_AA*I_mat;

% Linear indices of the pair variables that are tracked
W_AB=find(reshape(F_AB,N^2,1));
W_AA=find(reshape(F_AA,N^2,1));
d_AB=length(W_AB);
d_AA=length(W_AA);

Y0=[S_vec;I_vec;IS(W_AB);SS(W_AA);II(W_AA)];
options=odeset('abstol',tol(1),'reltol',tol(2));
[Time,Y]=ode23(@model_function,TSPAN,Y0,options,G,T,H_c,H_o,g,N,W_AA,W_AB,d_AA,d_AB,niter);
end
ODE function


function dY = model_function(~,Y0,G,T,H_c,H_o,g,N,W_AA,W_AB,d_AA,d_AB,niter)

% Unpack the state vector into node and pair probabilities
IS=spalloc(N,N,d_AB);
SS=spalloc(N,N,d_AB);
II=spalloc(N,N,d_AB);
S=Y0(1:N);
I=Y0(N+1:2*N);
IS(W_AB)=Y0(2*N+1:2*N+d_AB);
SS(W_AA)=Y0(2*N+d_AB+1:2*N+d_AB+d_AA);
II(W_AA)=Y0(2*N+d_AB+d_AA+1:2*N+d_AB+2*d_AA);
SS=SS+SS';
II=II+II';
inv_S=spdiags(spfun(@inve,S),0,N,N);
inv_I=spdiags(spfun(@inve,I),0,N,N);
R=T.*IS;

% Other code is unchanged from Sharkey (2011); the below runs niter iterations
% of the iterative method for calculating the maximum entropy distribution

IrSrS = R'*H_o.*(inv_S*SS);
IrSlI = IS*inv_S.*(H_o*R);

% Loop over closed pairs (i,j) and their common neighbours k
[ii,jj] = find(H_c);
for t=1:length(ii)
    i=ii(t); j=jj(t);
    kk = full(intersect(find(G(i,:)),find(G(j,:))));
    for k=kk
        % 2-by-2 pair distributions (state 1 = S, state 2 = I)
        aP12 = [full(SS(i,j)), full(IS(j,i)); full(IS(i,j)), full(II(i,j))];
        aP23 = [full(SS(j,k)), full(IS(k,j)); full(IS(j,k)), full(II(j,k))];
        aP13 = [full(SS(i,k)), full(IS(k,i)); full(IS(i,k)), full(II(i,k))];
        Tri = MaximumEntropy(aP12, aP23, aP13, 2, niter);
        IrSrS(i,j) = IrSrS(i,j) + Tri(1,1,2);
        IrSlI(i,j) = IrSlI(i,j) + Tri(2,1,2);
    end
end

SrSlI=IrSrS';
IrSrI=IrSlI';

% Right-hand sides for node and pair probabilities
dT=sum(R)';
dS=-dT;
dI=dT-g*I;
dSS=-IrSrS-SrSlI;
dIS=IrSrS-IrSlI-R-g*IS;
dII=IrSlI+IrSrI+R+R'-2*g*II;
dY=[dS;dI;dIS(W_AB);dSS(W_AA);dII(W_AA)];
end
Helper functions

function Ptriplet = MaximumEntropy(P12,P23,P13,ns,niter)
% Iterative proportional fitting: start from the uniform distribution over
% the ns^3 triplet states and rescale it in turn to match the pair marginals
% P12, P23 and P13. Returns the distribution stored at iteration niter.

P = zeros(ns,ns,ns,niter);
Ptemp_old = zeros(ns,ns,ns);
Ptemp_new = zeros(ns,ns,ns);
P(:,:,:,1) = 1/ns^3;

for ni = 2:niter

    % Match the marginal over nodes 1 and 2
    Ptemp_old = P(:,:,:,ni-1);
    for g1 = 1:ns
        for g2 = 1:ns
            for g3 = 1:ns
                den = sum(Ptemp_old(g1,g2,:));
                if den == 0
                    Ptemp_new(g1,g2,g3) = 0;
                else
                    Ptemp_new(g1,g2,g3) = P12(g1,g2) * Ptemp_old(g1,g2,g3) / den;
                end
            end
        end
    end
    testtemp = sum(sum(sum(Ptemp_new)));

    % Match the marginal over nodes 2 and 3
    Ptemp_old = Ptemp_new;
    for g1 = 1:ns
        for g2 = 1:ns
            for g3 = 1:ns
                den = sum(Ptemp_old(:,g2,g3));
                if den == 0
                    Ptemp_new(g1,g2,g3) = 0;
                else
                    Ptemp_new(g1,g2,g3) = P23(g2,g3) * Ptemp_old(g1,g2,g3) / den;
                end
            end
        end
    end
    testtemp = sum(sum(sum(Ptemp_new)));

    % Match the marginal over nodes 1 and 3
    Ptemp_old = Ptemp_new;
    for g1 = 1:ns
        for g2 = 1:ns
            for g3 = 1:ns
                den = sum(Ptemp_old(g1,:,g3));
                if den == 0
                    Ptemp_new(g1,g2,g3) = 0;
                else
                    Ptemp_new(g1,g2,g3) = P13(g1,g3) * Ptemp_old(g1,g2,g3) / den;
                end
            end
        end
    end
    testtemp = sum(sum(sum(Ptemp_new)));

    P(:,:,:,ni) = Ptemp_new;
end

Ptriplet = P(:,:,:,niter);
end
function M_out=inve(M_in)
% Element-wise reciprocal; called through spfun, so only nonzero entries are inverted
M_out=M_in.^(-1);
end
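
The helper MaximumEntropy above rescales a triplet distribution in turn so that each of its three pair marginals matches the prescribed one, which is the iterative proportional fitting scheme. The following is a minimal vectorised sketch of the same idea, offered as an illustration and not as part of the original listing. It assumes MATLAB R2016b or later (or GNU Octave) for implicit expansion; the distribution Ptrue is made up purely to generate a consistent set of pair marginals, and the iteration count of 200 is an arbitrary choice.

```matlab
% Minimal sketch (assumptions: Ptrue is an invented test distribution;
% 200 iterations is an arbitrary choice).
ns = 2;
Ptrue = zeros(ns,ns,ns);
Ptrue(:) = [0.30 0.10 0.05 0.15 0.10 0.05 0.05 0.20];   % entries sum to 1

% Pair marginals implied by Ptrue
P12 = sum(Ptrue,3);             % over nodes 1 and 2
P23 = squeeze(sum(Ptrue,1));    % over nodes 2 and 3
P13 = squeeze(sum(Ptrue,2));    % over nodes 1 and 3

P = ones(ns,ns,ns)/ns^3;        % uniform starting distribution
for it = 1:200
    m = sum(P,3);  m(m==0) = 1;                  % avoid 0/0; such entries of P are 0
    P = P .* (P12 ./ m);                         % match the (1,2) marginal
    m = sum(P,1);  m(m==0) = 1;
    P = P .* (reshape(P23,[1 ns ns]) ./ m);      % match the (2,3) marginal
    m = sum(P,2);  m(m==0) = 1;
    P = P .* (reshape(P13,[ns 1 ns]) ./ m);      % match the (1,3) marginal
end

% Largest mismatch between fitted and prescribed pair marginals
d = sum(P,3) - P12;            err12 = max(abs(d(:)));
d = squeeze(sum(P,1)) - P23;   err23 = max(abs(d(:)));
d = squeeze(sum(P,2)) - P13;   err13 = max(abs(d(:)));
fprintf('Marginal mismatches: %.2e %.2e %.2e\n', err12, err23, err13);
```

The fitted P reproduces the three pair marginals but need not coincide with Ptrue, since many triplet distributions share the same pairs; starting from the uniform distribution selects the one of maximum entropy.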